
Charles-Etienne Joseph

Research Master's - UdeM

Principal supervisor
Co-supervisor
Research topics
Representation learning
Deep learning
Optimization
Distributed systems

Publications

Continual Pre-training of MoEs: How robust is your router?
Zain Sarwar
Ashwinee Panda
Anirban Das
Shi-Xiong Zhang
Stephen Rawls
Sambit Sahu
Meta-learning Optimizers for Communication-Efficient Learning
μLO: Compute-Efficient Meta-Generalization of Learned Optimizers
Can We Learn Communication-Efficient Optimizers?
Learning Optimizers for Local SGD
Communication-efficient variants of SGD, specifically local SGD, have received a great deal of interest in recent years. These approaches compute multiple gradient steps locally, that is on each worker, before averaging model parameters, helping relieve the critical communication bottleneck in distributed deep learning training. Although many variants of these approaches have been proposed, they can sometimes lag behind state-of-the-art optimizers for deep learning. In this work, we incorporate local optimizers that compute multiple updates into a learned optimization framework, allowing us to meta-learn potentially more efficient local SGD algorithms. Our results demonstrate that local learned optimizers can substantially outperform local SGD and its sophisticated variants while maintaining their communication efficiency. We show that the learned optimizers can generalize to new datasets and architectures, demonstrating the potential of learned optimizers for improving communication-efficient distributed learning.
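The sketch below is a minimal illustration of the local SGD baseline described in the abstract (multiple local gradient steps per worker, followed by parameter averaging), not of the learned optimizer proposed in the paper. All names and constants (local_sgd, K workers, H local steps, the toy least-squares shards) are illustrative assumptions, and single-process loops stand in for real distributed workers.

```python
import numpy as np

# Hypothetical toy setup: each of K workers holds a shard of a least-squares
# problem. K, H, lr, and the data are illustrative, not taken from the paper.
rng = np.random.default_rng(0)
K, H, lr, dim, n_rounds = 4, 8, 0.05, 10, 20

# Per-worker shards for the local loss f_k(w) = ||A_k w - b_k||^2 / (2 m)
shards = [(rng.normal(size=(32, dim)), rng.normal(size=32)) for _ in range(K)]

def local_grad(w, A, b):
    # Gradient of one worker's local least-squares loss.
    return A.T @ (A @ w - b) / len(b)

def local_sgd(w_init):
    w = w_init.copy()
    for _ in range(n_rounds):
        local_models = []
        # Each worker takes H local gradient steps from the shared parameters...
        for A, b in shards:
            w_k = w.copy()
            for _ in range(H):
                w_k -= lr * local_grad(w_k, A, b)
            local_models.append(w_k)
        # ...and communication happens only once per round, by averaging
        # the workers' parameters (the step a learned optimizer could refine).
        w = np.mean(local_models, axis=0)
    return w

w_final = local_sgd(np.zeros(dim))
avg_loss = np.mean([np.mean((A @ w_final - b) ** 2) / 2 for A, b in shards])
print("average local loss after training:", avg_loss)
```

Communication cost here scales with the number of rounds rather than the total number of gradient steps, which is the bottleneck the paper's learned local optimizers aim to exploit more effectively.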