Publications

Learning few-shot imitation as cultural transmission
Avishkar Bhoopchand
Bethanie Brownfield
Adrian Collister
Agustin Dal Lago
Ashley Edwards
Richard Everett
Alexandre Fréchette
Yanko Gitahy Oliveira
Edward Hughes
Piermaria Mendolicchio
Julia Pawar
Miruna Pîslar
Alex Platonov
Evan Senter
Sukhdeep Singh
Alexander Zacherl
Lei M Zhang
Scalar Invariant Networks with Zero Bias
Chuqin Geng
Xiaojie Xu
Haolin Ye
Just like weights, bias terms are learnable parameters of many popular machine learning models, including neural networks. Biases are thought to enhance the representational power of neural networks, enabling them to solve a variety of tasks in computer vision. However, we argue that biases can be disregarded for some image-related tasks such as image classification, by considering the intrinsic distribution of images in the input space and desired model properties from first principles. Our findings suggest that zero-bias neural networks can perform comparably to biased networks for practical image classification tasks. We demonstrate that zero-bias neural networks possess a valuable property called scalar (multiplication) invariance: the prediction of the network remains unchanged when the contrast of the input image is altered. We extend scalar invariance to more general cases, enabling formal verification of certain convex regions of the input space. Additionally, we prove that zero-bias neural networks are fair in predicting the zero image. Unlike state-of-the-art models that may exhibit bias toward certain labels, zero-bias networks have uniform belief in all labels. We believe dropping bias terms can be considered a geometric prior in designing neural network architectures for image classification, in the same spirit as adopting convolutions as a translational-invariance prior. The robustness and fairness advantages of zero-bias neural networks may also indicate a promising path towards trustworthy and ethical AI.
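A minimal sketch (not the authors' code) of the scalar-invariance property described in the abstract: with all bias terms removed and ReLU activations, the network output is positively homogeneous in its input, so rescaling the input contrast by any c > 0 rescales the logits but leaves the predicted class unchanged. The two-layer zero-bias MLP below is a hypothetical stand-in for the models studied in the paper.

```python
# Illustrative sketch: zero-bias ReLU networks are scalar (contrast) invariant.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical zero-bias two-layer MLP: logits = W2 @ relu(W1 @ x), no bias terms.
W1 = rng.normal(size=(64, 784))
W2 = rng.normal(size=(10, 64))

def zero_bias_mlp(x):
    return W2 @ np.maximum(W1 @ x, 0.0)

x = rng.normal(size=784)            # stand-in for a flattened image
base_logits = zero_bias_mlp(x)
for c in (0.5, 1.0, 3.0):           # contrast rescalings
    logits = zero_bias_mlp(c * x)
    # Positive homogeneity: logits for c*x equal c times the logits for x.
    assert np.allclose(logits, c * base_logits)
    print(c, logits.argmax())       # predicted class is identical for every c
```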
Symmetry Breaking and Equivariant Neural Networks
Sékou-Oumar Kaba
Using symmetry as an inductive bias in deep learning has been proven to be a principled approach for sample-efficient model design. However, the relationship between symmetry and the imperative for equivariance in neural networks is not always obvious. Here, we analyze a key limitation that arises in equivariant functions: their incapacity to break symmetry at the level of individual data samples. In response, we introduce a novel notion of 'relaxed equivariance' that circumvents this limitation. We further demonstrate how to incorporate this relaxation into equivariant multilayer perceptrons (E-MLPs), offering an alternative to the noise-injection method. The relevance of symmetry breaking is then discussed in various application domains: physics, graph representation learning, combinatorial optimization and equivariant decoding.
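A minimal sketch (not from the paper) of the limitation the abstract refers to: an equivariant map cannot break a symmetry already present in its input. The permutation-equivariant layer below (a Deep Sets-style construction, chosen here for illustration) is applied to an input that is invariant under swapping its two rows; the output is necessarily invariant under the same swap, so no symmetry-broken output is reachable.

```python
# Illustrative sketch: equivariant functions preserve input symmetries.
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4))

def equivariant_layer(X):
    # f(X)_i = A x_i + B (mean_j x_j): permutation equivariant by construction.
    pooled = X.mean(axis=0) @ B.T
    return X @ A.T + np.tile(pooled, (X.shape[0], 1))

X_sym = np.stack([np.ones(4), np.ones(4)])  # invariant under swapping its rows
Y = equivariant_layer(X_sym)
# The output rows are identical, so the row-swap symmetry cannot be broken.
print(np.allclose(Y[0], Y[1]))  # True
```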
On the Information Geometry of Vision Transformers
Sonia Joseph
Kumar Krishna Agrawal
Arna Ghosh
On the Varied Faces of Overparameterization in Supervised and Self-Supervised Learning
Matteo Gamba
Arna Ghosh
Kumar Krishna Agrawal
Hossein Azizpour
Mårten Björkman
The quality of the representations learned by neural networks depends on several factors, including the loss function, learning algorithm, and model architecture. In this work, we use information geometric measures to assess representation quality in a principled manner. We demonstrate that the sensitivity of learned representations to input perturbations, measured by the spectral norm of the feature Jacobian, provides valuable information about downstream generalization. On the other hand, measuring the coefficient of spectral decay observed in the eigenspectrum of the feature covariance provides insights into the global representation geometry. First, we empirically establish an equivalence between these notions of representation quality and show that they are inversely correlated. Second, our analysis reveals the varying roles that overparameterization plays in improving generalization. Unlike supervised learning, we observe that increasing model width leads to higher discriminability and less smoothness in the self-supervised regime. Furthermore, we report that there is no observable double descent phenomenon in SSL with non-contrastive objectives for commonly used parameterization regimes, which opens up new opportunities for tight asymptotic analysis. Taken together, our results provide a loss-aware characterization of the different roles of overparameterization in supervised and self-supervised learning.
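A minimal sketch (assumptions only, not the paper's code) of the two representation-quality measures the abstract mentions: (i) the spectral norm of the feature Jacobian at an input, a sensitivity/smoothness measure, and (ii) a power-law decay coefficient alpha fitted to the eigenspectrum of the feature covariance, a global-geometry measure. A random one-layer ReLU feature map stands in for a trained encoder.

```python
# Illustrative sketch: Jacobian spectral norm and spectral decay of feature covariance.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 128)) / np.sqrt(128)   # stand-in encoder weights
# Feature map f(x) = relu(W x); its Jacobian at x is diag(1[Wx > 0]) @ W.

# (i) Spectral norm of the feature Jacobian at one input (sensitivity measure).
x = rng.normal(size=128)
J = W * (W @ x > 0)[:, None]
jac_spectral_norm = np.linalg.svd(J, compute_uv=False)[0]

# (ii) Spectral decay coefficient alpha of the feature covariance eigenspectrum,
# fitted as lambda_k ~ k^(-alpha) via a log-log least-squares slope over a batch.
X = rng.normal(size=(2048, 128))
F = np.maximum(X @ W.T, 0.0)                     # features for the whole batch
eigs = np.sort(np.linalg.eigvalsh(np.cov(F, rowvar=False)))[::-1]
k = np.arange(1, eigs.size + 1)
alpha = -np.polyfit(np.log(k), np.log(np.maximum(eigs, 1e-12)), 1)[0]

print(f"Jacobian spectral norm: {jac_spectral_norm:.3f}  spectral decay alpha: {alpha:.3f}")
```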
Author Correction: 30×30 biodiversity gains rely on national coordination
Isaac Eckert
Andrea Brown
Dominique Caron
Federico Riva
Exploring the multidimensional nature of repetitive and restricted behaviors and interests (RRBI) in autism: neuroanatomical correlates and clinical implications
Aline Lefebvre
Nicolas Traut
Amandine Pedoux
Anna Maruani
Anita Beggiato
Monique Elmaleh
David Germanaud
Anouck Amestoy
Myriam Ly‐Le Moal
Christopher H. Chatham
Lorraine Murtagh
Manuel Bouvard
Marianne Alisson
Marion Leboyer
Thomas Bourgeron
Roberto Toro
Clara A. Moreau
Richard Delorme
scGeneRythm: Using Neural Networks and Fourier Transformation to Cluster Genes by Time-Frequency Patterns in Single-Cell Data
Yiming Jia
Hao Wu
The search for the lost attractor
Mario Pasquato
Syphax Haddad
Pierfrancesco Di Cintio
Alexandre Adam
Pablo Lemos
Noé Dia
Mircea Petrache
Ugo Niccolo Di Carlo
Alessandro A. Trani
Mitigating Biases with Diverse Ensembles and Diffusion Models
Luca Scimeca
Alexander Rubinstein
Damien Teney
Seong Joon Oh
Armand Nicolicioiu
Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to a phenomenon known as shortcut learning, where a model relies on erroneous, easy-to-learn cues while ignoring reliable ones. In this work, we propose an ensemble diversification framework exploiting Diffusion Probabilistic Models (DPMs) to mitigate this form of bias. We show that at particular training intervals, DPMs can generate images with novel feature combinations, even when trained on samples displaying correlated input features. We leverage this crucial property to generate synthetic counterfactuals to increase model diversity via ensemble disagreement. We show that DPM-guided diversification is sufficient to remove dependence on primary shortcut cues, without a need for additional supervised signals. We further empirically quantify its efficacy on several diversification objectives, and finally show improved generalization and diversification performance on par with prior work that relies on auxiliary data collection.
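A minimal sketch (illustrative only; the objective form and names below are assumptions, not the paper's exact method) of diversification via ensemble disagreement: each member is fit normally on labelled data, while on synthetic (e.g. DPM-generated) counterfactuals the members are penalized for agreeing, which discourages all of them from latching onto the same shortcut cue.

```python
# Illustrative sketch: a pairwise-agreement penalty on a batch of synthetic samples.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def agreement_penalty(logits_per_member):
    # logits_per_member: (n_members, batch, n_classes)
    probs = softmax(logits_per_member)
    n, total = len(probs), 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # Mean inner product of predicted distributions: high when members agree.
            total += np.mean(np.sum(probs[i] * probs[j], axis=-1))
    return total / (n * (n - 1) / 2)

# Two hypothetical ensemble members evaluated on a batch of synthetic counterfactuals;
# minimizing this term alongside the usual supervised loss pushes them to disagree.
logits = rng.normal(size=(2, 16, 10))
print(f"agreement penalty on synthetic batch: {agreement_penalty(logits):.3f}")
```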
Propositional Logics for the Lawvere Quantale
Giorgio Bacci
Radu Mardare
Gordon Plotkin
scSniper: Single-cell Deep Neural Network-based Identification of Prominent Biomarkers
Mingyang Li
Yanshuo Chen