Scalar Invariant Networks with Zero Bias
Chuqin Geng
Xiaojie Xu
Haolin Ye
Just like weights, bias terms are learnable parameters of many popular machine learning models, including neural networks. Biases are thought to enhance the representational power of neural networks, enabling them to solve a variety of tasks in computer vision. However, we argue that biases can be disregarded for some image-related tasks such as image classification, by considering the intrinsic distribution of images in the input space and desired model properties from first principles. Our findings suggest that zero-bias neural networks can perform comparably to biased networks for practical image classification tasks. We demonstrate that zero-bias neural networks possess a valuable property called scalar (multiplication) invariance: the prediction of the network remains unchanged when the contrast of the input image is altered. We extend scalar invariance to more general cases, enabling formal verification of certain convex regions of the input space. Additionally, we prove that zero-bias neural networks are fair in predicting the zero image. Unlike state-of-the-art models that may exhibit bias toward certain labels, zero-bias networks have uniform belief in all labels. We believe dropping bias terms can be considered a geometric prior in designing neural network architectures for image classification, in the same spirit as adopting convolutions as a translation-invariance prior. The robustness and fairness advantages of zero-bias neural networks may also indicate a promising path towards trustworthy and ethical AI.
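The scalar-invariance claim can be checked directly: with no bias terms, linear layers and ReLU are both positively homogeneous, so scaling an input by c > 0 scales every logit by c and leaves the argmax prediction unchanged. Below is a minimal sketch of that check; the architecture and input sizes are illustrative placeholders, not the models evaluated in the paper.

```python
import torch
import torch.nn as nn

# Minimal bias-free ReLU classifier (illustrative architecture only).
torch.manual_seed(0)
net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128, bias=False),
    nn.ReLU(),
    nn.Linear(128, 10, bias=False),
)

x = torch.rand(1, 1, 28, 28)   # a random "image"
c = 0.3                        # contrast / scalar factor, c > 0

logits = net(x)
logits_scaled = net(c * x)

# Positive homogeneity: f(c * x) == c * f(x) for c > 0, because every layer is
# linear without bias or ReLU, both of which commute with positive scaling.
print(torch.allclose(logits_scaled, c * logits, atol=1e-6))   # True
print(logits.argmax(dim=1) == logits_scaled.argmax(dim=1))    # prediction unchanged
```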
Symmetry Breaking and Equivariant Neural Networks
Sékou-Oumar Kaba
Using symmetry as an inductive bias in deep learning has been proven to be a principled approach for sample-efficient model design. However, the relationship between symmetry and the imperative for equivariance in neural networks is not always obvious. Here, we analyze a key limitation that arises in equivariant functions: their incapacity to break symmetry at the level of individual data samples. In response, we introduce a novel notion of 'relaxed equivariance' that circumvents this limitation. We further demonstrate how to incorporate this relaxation into equivariant multilayer perceptrons (E-MLPs), offering an alternative to the noise-injection method. The relevance of symmetry breaking is then discussed in various application domains: physics, graph representation learning, combinatorial optimization, and equivariant decoding.
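The symmetry-breaking limitation can be seen in a few lines with a standard permutation-equivariant (DeepSets-style) layer; the layer and its parameters below are illustrative assumptions, not the E-MLPs from the paper. If a sample is fixed by a group element, equivariance forces the output to be fixed by the same element, so the output cannot be less symmetric than the input.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, gam = rng.normal(), rng.normal()

def equivariant_layer(x):
    # Standard permutation-equivariant map: f(x)_i = lam * x_i + gam * sum_j x_j.
    return lam * x + gam * x.sum()

# A sample fixed by the swap g = (0 1): its first two entries are equal.
x = np.array([2.0, 2.0, -1.0, 5.0])
g = np.array([1, 0, 2, 3])   # permutation swapping indices 0 and 1

y = equivariant_layer(x)

# Equivariance: f(x[g]) == f(x)[g].  Since x[g] == x, the output must satisfy
# y[g] == y, i.e. y[0] == y[1] -- the sample's symmetry cannot be broken.
assert np.allclose(equivariant_layer(x[g]), y[g])
print(y[0] == y[1])          # True: the output inherits the input's symmetry
```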
On the Information Geometry of Vision Transformers
Sonia Joseph
Kumar Krishna Agrawal
Arna Ghosh
On the Varied Faces of Overparameterization in Supervised and Self-Supervised Learning
Matteo Gamba
Arna Ghosh
Kumar Krishna Agrawal
Hossein Azizpour
Mårten Björkman
The quality of the representations learned by neural networks depends on several factors, including the loss function, learning algorithm, and model architecture. In this work, we use information geometric measures to assess representation quality in a principled manner. We demonstrate that the sensitivity of learned representations to input perturbations, measured by the spectral norm of the feature Jacobian, provides valuable information about downstream generalization. On the other hand, measuring the coefficient of spectral decay observed in the eigenspectrum of the feature covariance provides insights into the global representation geometry. First, we empirically establish an equivalence between these notions of representation quality and show that they are inversely correlated. Second, our analysis reveals the varying roles that overparameterization plays in improving generalization. Unlike supervised learning, we observe that increasing model width leads to higher discriminability and less smoothness in the self-supervised regime. Furthermore, we report that there is no observable double descent phenomenon in SSL with non-contrastive objectives for commonly used parameterization regimes, which opens up new opportunities for tight asymptotic analysis. Taken together, our results provide a loss-aware characterization of the differing roles of overparameterization in supervised and self-supervised learning.
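The two measures mentioned above can be estimated roughly as follows: the spectral norm of the feature Jacobian at a sample, and the power-law decay coefficient of the feature-covariance eigenspectrum fitted in log-log space. The encoder, data, and estimator details below are placeholder assumptions for illustration; the paper's exact estimators may differ.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))  # toy encoder

# (1) Sensitivity to input perturbations: spectral norm of the feature Jacobian at x.
x = torch.randn(32)
J = torch.autograd.functional.jacobian(encoder, x)        # shape (16, 32)
jacobian_spectral_norm = torch.linalg.matrix_norm(J, ord=2)

# (2) Global geometry: fit the decay coefficient alpha of the covariance eigenspectrum,
#     assuming lambda_k ~ k^(-alpha) (ordinary least squares in log-log space).
X = torch.randn(2048, 32)
with torch.no_grad():
    feats = encoder(X)
feats = feats - feats.mean(dim=0)
cov = feats.T @ feats / (feats.shape[0] - 1)
eigvals = torch.linalg.eigvalsh(cov).flip(0).clamp_min(1e-12)  # descending, positive
k = torch.arange(1, eigvals.numel() + 1, dtype=torch.float32)
A = torch.stack([torch.log(k), torch.ones_like(k)], dim=1)
slope, _ = torch.linalg.lstsq(A, torch.log(eigvals).unsqueeze(1)).solution.squeeze(1)
alpha = -slope

print(float(jacobian_spectral_norm), float(alpha))
```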
Author Correction: 30×30 biodiversity gains rely on national coordination
Isaac Eckert
Andrea Brown
Dominique Caron
Federico Riva
Exploring the multidimensional nature of repetitive and restricted behaviors and interests (RRBI) in autism: neuroanatomical correlates and clinical implications
Aline Lefebvre
Nicolas Traut
Amandine Pedoux
Anna Maruani
Anita Beggiato
Monique Elmaleh
David Germanaud
Anouck Amestoy
Myriam Ly‐Le Moal
Christopher H. Chatham
Lorraine Murtagh
Manuel Bouvard
Marianne Alisson
Marion Leboyer
Thomas Bourgeron
Roberto Toro
Clara A. Moreau
Richard Delorme
scGeneRythm: Using Neural Networks and Fourier Transformation to Cluster Genes by Time-Frequency Patterns in Single-Cell Data
Yiming Jia
Hao Wu
The search for the lost attractor
Mario Pasquato
Syphax Haddad
Pierfrancesco Di Cintio
Alexandre Adam
Pablo Lemos
Noé Dia
Mircea Petrache
Ugo Niccolo Di Carlo
Alessandro A. Trani
Hessian Aware Low-Rank Perturbation for Order-Robust Continual Learning
Jiaqi Li
Rui Wang
Yuanhao Lai
Changjian Shui
Sabyasachi Sahoo
Charles Ling
Shichun Yang
Boyu Wang
Fan Zhou
Unlearning via Sparse Representations
Vedant Shah
Frederik Träuble
Ashish Malik
Michael Curtis Mozer
Sanjeev Arora
Anirudh Goyal
Machine unlearning, which involves erasing knowledge about a forget set from a trained model, can prove costly and infeasible with existing techniques. We propose a nearly compute-free zero-shot unlearning technique based on a discrete representational bottleneck. We show that the proposed technique efficiently unlearns the forget set and incurs negligible damage to the model's performance on the rest of the data set. We evaluate the proposed technique on the problem of class unlearning using three datasets: CIFAR-10, CIFAR-100, and LACUNA-100. We compare the proposed technique to SCRUB, a state-of-the-art approach that uses knowledge distillation for unlearning. Across all three datasets, the proposed technique performs as well as, if not better than, SCRUB while incurring almost no computational cost.
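A schematic sketch of the general idea, under the assumption of a simple nearest-neighbour discrete bottleneck (not the authors' implementation): because representations route through discrete codes, the codes associated with the forget class can simply be deactivated, which requires no gradient updates and is therefore nearly compute-free. The code-selection rule used here (dropping codes whose stored values favour the forget class) is a simplification for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discrete bottleneck: an input is quantized to its nearest codebook entry,
# and a per-code value vector (here, class logits) is looked up downstream.
num_codes, dim, num_classes = 64, 16, 10
codebook = rng.normal(size=(num_codes, dim))
values = rng.normal(size=(num_codes, num_classes))   # stand-in for learned per-code logits

def predict(x, active):
    # Route only through active codes: inactive ones get an infinite distance.
    dist = np.linalg.norm(codebook - x, axis=1) + np.where(active, 0.0, np.inf)
    return values[dist.argmin()].argmax()

active = np.ones(num_codes, dtype=bool)

# "Zero-shot" class unlearning: deactivate the codes tied to the forget class,
# with no gradient computation or retraining involved.
forget_class = 3
forget_codes = values.argmax(axis=1) == forget_class  # codes whose values favour the forget class
active[forget_codes] = False

x = rng.normal(size=dim)
print(predict(x, active))   # prediction can no longer route through forget-class codes
```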