Manuela Girotti

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD

Alireza Mousavi-Hosseini

Sejun Park

Murat A. Erdogdu

We study the problem of training a two-layer neural network (NN) of arbitrary width using stochastic gradient descent (SGD) where the input …

2023-01-31

ICLR.cc/2023/Conference (notable)

doi.org

openreview.net

Convergence Analysis and Implicit Regularization of Feedback Alignment for Deep Linear Networks

Manuela Girotti

Ioannis Mitliagkas

Gauthier Gidel

We theoretically analyze the Feedback Alignment (FA) algorithm, an efficient alternative to backpropagation for training neural networks. We… (see more) provide convergence guarantees with rates for deep linear networks for both continuous and discrete dynamics. Additionally, we study incremental learning phenomena for shallow linear networks. Interestingly, certain specific initializations imply that negligible components are learned before the principal ones, thus potentially negatively affecting the effectiveness of such a learning algorithm; a phenomenon we classify as implicit anti-regularization. We also provide initialization schemes where the components of the problem are approximately learned by decreasing order of importance, thus providing a form of implicit regularization.

2021-10-19

arXiv (Cornell University) (preprint)

doi.org

arxiv.org

A Study of Condition Numbers for First-Order Optimization

Charles Guille-escuret

Baptiste Goujaud

Manuela Girotti

Ioannis Mitliagkas

The study of first-order optimization algorithms (FOA) typically starts with assumptions on the objective functions, most commonly smoothnes… (see more)s and strong convexity. These metrics are used to tune the hyperparameters of FOA. We introduce a class of perturbations quantified via a new norm, called *-norm. We show that adding a small perturbation to the objective function has an equivalently small impact on the behavior of any FOA, which suggests that it should have a minor impact on the tuning of the algorithm. However, we show that smoothness and strong convexity can be heavily impacted by arbitrarily small perturbations, leading to excessively conservative tunings and convergence issues. In view of these observations, we propose a notion of continuity of the metrics, which is essential for a robust tuning strategy. Since smoothness and strong convexity are not continuous, we propose a comprehensive study of existing alternative metrics which we prove to be continuous. We describe their mutual relations and provide their guaranteed convergence rates for the Gradient Descent algorithm accordingly tuned. Finally we discuss how our work impacts the theoretical understanding of FOA and their performances.

2021-03-17

Proceedings of The 24th International Conference on Artificial Intelligence and Statistics (published)

doi.org

proceedings.mlr.press

AI Policy Fellowship Publications

Mila Ventures Launchpad

AI Policy Compass

Manuela Girotti

Publications

AI Policy Fellowship Publications

Mila Ventures Launchpad

AI Policy Compass

Popular keywords:

Manuela Girotti

Publications