Arna Ghosh

2023-11-28

NeurIPS.cc/2023/Workshop/NeurReps (poster)

On the Varied Faces of Overparameterization in Supervised and Self-Supervised Learning

Matteo Gamba

Kumar Krishna Agrawal

Agrawal

Hossein Azizpour

Mårten Björkman

The quality of the representations learned by neural networks depends on several factors, including the loss function, learning algorithm, a… (see more)nd model architecture. In this work, we use information geometric measures to assess the representation quality in a principled manner. We demonstrate that the sensitivity of learned representations to input perturbations, measured by the spectral norm of the feature Jacobian, provides valuable information about downstream generalization. On the other hand, measuring the coefficient of spectral decay observed in the eigenspectrum of feature covariance provides insights into the global representation geometry. First, we empirically establish an equivalence between these notions of representation quality and show that they are inversely correlated. Second, our analysis reveals the varying roles that overparameterization plays in improving generalization. Unlike supervised learning, we observe that increasing model width leads to higher discriminability and less smoothness in the self-supervised regime. Furthermore, we report that there is no observable double descent phenomenon in SSL with non-contrastive objectives for commonly used parameterization regimes, which opens up new opportunities for tight asymptotic analysis. Taken together, our results provide a loss-aware characterization of the different role of overparameterization in supervised and self-supervised learning.

2023-11-28

NeurIPS.cc/2023/Workshop/NeurReps (poster)

The feature landscape of visual cortex

Rudi Tong

Ronan da Silva

Dongyan Lin

James Wilsenach

Stuart Trenholm

Understanding computations in the visual system requires a characterization of the distinct feature preferences of neurons in different visu… (see more)al cortical areas. However, we know little about how feature preferences of neurons within a given area relate to that area’s role within the global organization of visual cortex. To address this, we recorded from thousands of neurons across six visual cortical areas in mouse and leveraged generative AI methods combined with closed-loop neuronal recordings to identify each neuron’s visual feature preference. First, we discovered that the mouse’s visual system is globally organized to encode features in a manner invariant to the types of image transformations induced by self-motion. Second, we found differences in the visual feature preferences of each area and that these differences generalized across animals. Finally, we observed that a given area’s collection of preferred stimuli (‘own-stimuli’) drive neurons from the same area more effectively through their dynamic range compared to preferred stimuli from other areas (‘other-stimuli’). As a result, feature preferences of neurons within an area are organized to maximally encode differences among own-stimuli while remaining insensitive to differences among other-stimuli. These results reveal how visual areas work together to efficiently encode information about the external world.

2023-11-05

bioRxiv (preprint)

doi.org

Learning better with Dale’s Law: A Spectral Perspective

How gradient estimator variance and bias impact learning in neural networks

Yuhan Helena Liu

Guillaume Lajoie

Konrad Paul Kording

There is growing interest in understanding how real brains may approximate gradients and how gradients can be used to train neuromorphic chi… (see more)ps. However, neither real brains nor neuromorphic chips can perfectly follow the loss gradient, so parameter updates would necessarily use gradient estimators that have some variance and/or bias. Therefore, there is a need to understand better how variance and bias in gradient estimators impact learning dependent on network and task properties. Here, we show that variance and bias can impair learning on the training data, but some degree of variance and bias in a gradient estimator can be beneficial for generalization. We find that the ideal amount of variance and bias in a gradient estimator are dependent on several properties of the network and task: the size and activity sparsity of the network, the norm of the gradient, and the curvature of the loss landscape. As such, whether considering biologically-plausible learning algorithms or algorithms for training neuromorphic chips, researchers can analyze these properties to determine whether their approximation to gradient descent will be effective for learning given their network and task properties.

2023-02-01

ICLR.cc/2023/Conference (poster)

H OW GRADIENT ESTIMATOR VARIANCE AND BIAS COULD IMPACT LEARNING IN NEURAL CIRCUITS

Yuhan Helena Liu

Guillaume Lajoie

Konrad K¨ording

There is growing interest in understanding how real brains may approximate gradients and how gradients can be used to train neuromorphic chi… (see more)ps. However, neither real brains nor neuromorphic chips can perfectly follow the loss gradient, so parameter updates would necessarily use gradient estimators that have some variance and/or bias. Therefore, there is a need to understand better how variance and bias in gradient estimators impact learning dependent on network and task properties. Here, we show that variance and bias can impair learning on the training data, but some degree of variance and bias in a gradient estimator can be beneficial for generalization. We find that the ideal amount of variance and bias in a gradient estimator are dependent on several properties of the network and task: the size and activity sparsity of the network, the norm of the gradient, and the curvature of the loss landscape. As such, whether considering biologically-plausible learning algorithms or algorithms for training neuromorphic chips, researchers can analyze these properties to determine whether their approximation to gradient descent will be effective for learning given their network and task properties.

2023-01-01

(published)

www.semanticscholar.org

Current State and Future Directions for Learning in Biological Recurrent Neural Networks: A Perspective Piece

Luke Y. Prince

Roy Henha Eyono

Ellen Boven

Joseph Pemberton

Franz Scherr

Claudia Clopath

Rui Ponte Costa

Wolfgang Maass

Cristina Savin

Katharina Wilmes

We provide a brief review of the common assumptions about biological learning with findings from experimental neuroscience and contrast them… (see more) with the efficiency of gradient-based learning in recurrent neural networks. The key issues discussed in this review include: synaptic plasticity, neural circuits, theory-experiment divide, and objective functions. We conclude with recommendations for both theoretical and experimental neuroscientists when designing new studies that could help bring clarity to these issues.

2022-04-27

Neurons, Behavior, Data analysis, and Theory (published)

doi.org

arxiv.org

$\alpha$-ReQ : Assessing Representation Quality in Self-Supervised Learning by measuring eigenspectrum decay

Kumar Krishna Agrawal

Arnab Kumar Mondal

Self-Supervised Learning (SSL) with large-scale unlabelled datasets enables learning useful representations for multiple downstream tasks. H… (see more)owever, assessing the quality of such representations efficiently poses nontrivial challenges. Existing approaches train linear probes (with frozen features) to evaluate performance on a given task. This is expensive both computationally, since it requires retraining a new prediction head for each downstream task, and statistically, requires task-specific labels for multiple tasks. This poses a natural question, how do we efficiently determine the "goodness" of representations learned with SSL across a wide range of potential downstream tasks? In particular, a task-agnostic statistical measure of representation quality, that predicts generalization without explicit downstream task evaluation, would be highly desirable. In this work, we analyze characteristics of learned representations

Beyond accuracy: generalization properties of bio-plausible temporal credit assignment rules

Yuhan Helena Liu

Eric Todd SheaBrown

On the Varied Faces of Overparameterization in Supervised and Self-Supervised Learning

Matteo Gamba