Abhinav Moudgil

Can We Learn Communication-Efficient Optimizers?

Charles-Étienne Joseph

Benjamin Thérien

Abhinav Moudgil

Boris Knyazev

Eugene Belilovsky

2023-12-02

ArXiv (preprint)

doi.org

arxiv.org

Learning Optimizers for Local SGD

Charles-Étienne Joseph

Benjamin Thérien

Abhinav Moudgil

Boris Knyazev

Eugene Belilovsky

2023-10-27

NeurIPS.cc/2023/Workshop/Federated_Learning (poster)

openreview.net

Learning to Optimize with Recurrent Hierarchical Transformers

Abhinav Moudgil

Boris Knyazev

Guillaume Lajoie

Eugene Belilovsky

2023-06-19

ICML.cc/2023/Workshop/Frontiers4LCD (published)

openreview.net

Towards Scaling Difference Target Propagation by Learning Backprop Targets

Maxence Ernoult

Fabrice Normandin

Abhinav Moudgil

Sean Spinney

The development of biologically-plausible learning algorithms is important for understanding learning in the brain, but most of them fail to scale up to real-world tasks, limiting their potential as explanations for learning by real brains. As such, it is important to explore learning algorithms that come with strong theoretical guarantees and can match the performance of backpropagation (BP) on complex tasks. One such algorithm is Difference Target Propagation (DTP), a biologically-plausible learning algorithm whose close relation with Gauss-Newton (GN) optimization has been recently established. However, the conditions under which this connection rigorously holds preclude layer-wise training of the feedback pathway synaptic weights (which is more biologically plausible). Moreover, good alignment between DTP weight updates and loss gradients is only loosely guaranteed and under very specific conditions for the architecture being trained. In this paper, we propose a novel feedback weight training scheme that ensures both that DTP approximates BP and that layer-wise feedback weight training can be restored without sacrificing any theoretical guarantees. Our theory is corroborated by experimental results and we report the best performance ever achieved by DTP on CIFAR-10 and ImageNet 32.

2022-06-28

Proceedings of the 39th International Conference on Machine Learning (published)

proceedings.mlr.press

arxiv.org

Learning Optimizers for Local SGD

Charles-Étienne Joseph

Benjamin Thérien

Abhinav Moudgil

Boris Knyazev

Eugene Belilovsky

Communication-efficient variants of SGD, specifically local SGD, have received a great deal of interest in recent years. These approaches compute multiple gradient steps locally, that is, on each worker, before averaging model parameters, helping relieve the critical communication bottleneck in distributed deep learning training. Although many variants of these approaches have been proposed, they can sometimes lag behind state-of-the-art optimizers for deep learning. In this work, we incorporate local optimizers that compute multiple updates into a learned optimization framework, allowing us to meta-learn potentially more efficient local SGD algorithms. Our results demonstrate that local learned optimizers can substantially outperform local SGD and its sophisticated variants while maintaining their communication efficiency. We show that the learned optimizers can generalize to new datasets and architectures, demonstrating the potential of learned optimizers for improving communication-efficient distributed learning.
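The local SGD baseline described in this abstract (several gradient steps on each worker, then one parameter-averaging step) can be sketched on a toy problem. This is a minimal illustration, not the paper's code and not the learned optimizer it proposes. The function name `local_sgd`, the noiseless least-squares objective, the use of full-shard gradients instead of mini-batches, and all hyperparameter values are assumptions chosen for brevity.

```python
import numpy as np


def local_sgd(X_shards, y_shards, w0, rounds=200, local_steps=4, lr=0.1):
    """Toy local SGD on least squares (hypothetical helper, not from the paper).

    Each worker starts from the shared parameters, takes `local_steps`
    gradient steps on its own data shard, and the resulting worker
    parameters are averaged once per round (the only communication).
    """
    w = w0.copy()
    for _ in range(rounds):
        worker_params = []
        for X, y in zip(X_shards, y_shards):
            w_k = w.copy()  # worker copy of the shared parameters
            for _ in range(local_steps):
                # Gradient of 0.5 * mean((X w - y)^2) on this worker's shard.
                grad = X.T @ (X @ w_k - y) / len(y)
                w_k -= lr * grad
            worker_params.append(w_k)
        # Communication step: average parameters across workers.
        w = np.mean(worker_params, axis=0)
    return w


# Synthetic data: 4 workers, each holding 64 noiseless samples (assumption).
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])
X_shards = [rng.normal(size=(64, 3)) for _ in range(4)]
y_shards = [X @ w_true for X in X_shards]

w_hat = local_sgd(X_shards, y_shards, np.zeros(3))
print(w_hat)
```

With `local_steps=4`, the workers communicate once for every four gradient steps, which is the communication saving the abstract refers to. Setting `local_steps=1` makes every round a single gradient step followed by averaging, which for this linear toy problem is equivalent to ordinary synchronous data-parallel training.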

2000-01-01

(published)

www.semanticscholar.org
