Portrait of Courtney Paquette

Courtney Paquette

Associate Academic Member
Canada CIFAR AI Chair
Assistant Professor, McGill University, Department of Mathematics and Statistics
Research Scientist, Google Brain
Research Topics
Optimization

Biography

Courtney Paquette is an assistant professor at McGill University and a Canada CIFAR AI Chair at Mila – Quebec Artificial Intelligence Institute. Her research focuses on the design and analysis of algorithms for large-scale optimization problems, with applications in data science. She received her PhD in mathematics from the University of Washington (2017), held postdoctoral positions at Lehigh University (2017-2018) and the University of Waterloo (NSF postdoctoral fellowship, 2018-2019), and was a research scientist at Google Research, Brain Team in Montreal (2019-2020).

Current Students

Master's Research - McGill
Postdoctorate - McGill
PhD - UdeM
Principal supervisor:
Master's Research - McGill
PhD - McGill
Principal supervisor:
Master's Research - McGill
PhD - McGill

Publications

Dimension-adapted Momentum Outscales SGD
Damien Ferbach
Katie Everett
Elliot Paquette
We investigate scaling laws for stochastic momentum algorithms with small batch on the power law random features model, parameterized by data complexity, target complexity, and model size. When trained with a stochastic momentum algorithm, our analysis reveals four distinct loss curve shapes determined by varying data-target complexities. While traditional stochastic gradient descent with momentum (SGD-M) yields identical scaling law exponents to SGD, dimension-adapted Nesterov acceleration (DANA) improves these exponents by scaling momentum hyperparameters based on model size and data complexity. This outscaling phenomenon, which also improves compute-optimal scaling behavior, is achieved by DANA across a broad range of data and target complexities, while traditional methods fall short. Extensive experiments on high-dimensional synthetic quadratics validate our theoretical predictions, and large-scale text experiments with LSTMs show that DANA's improved loss exponents over SGD hold in a practical setting.
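As a rough illustration of the setup described in this abstract, the sketch below compares plain SGD with momentum against a variant whose momentum hyperparameters are scaled with the model size, on a synthetic power-law quadratic. It is a minimal toy example under assumed choices (the power-law exponent, the noise model, and the beta = 1 - 1/sqrt(d) scaling are all illustrative), not the paper's DANA recursion or experimental protocol; the function names are hypothetical.

```python
import numpy as np

# Synthetic power-law quadratic: eigenvalues decay as k^(-alpha) (assumed model).
rng = np.random.default_rng(0)
d = 2000                                       # model size
alpha = 1.2                                    # assumed spectral decay exponent
eigs = np.arange(1, d + 1, dtype=float) ** (-alpha)
x_star = rng.standard_normal(d) / np.sqrt(d)   # target parameters

def loss(x):
    return 0.5 * np.sum(eigs * (x - x_star) ** 2)

def stochastic_grad(x, batch=8):
    # True gradient of the quadratic plus mini-batch-style noise (illustrative).
    noise = rng.standard_normal(d) * np.sqrt(eigs) / np.sqrt(batch)
    return eigs * (x - x_star) + noise

def sgd_momentum(steps=3000, lr=0.5, beta=0.9):
    # Standard heavy-ball SGD-M with fixed momentum and step size.
    x, v = np.zeros(d), np.zeros(d)
    for _ in range(steps):
        v = beta * v - lr * stochastic_grad(x)
        x = x + v
    return loss(x)

def dim_adapted_momentum(steps=3000, lr=0.5):
    # Illustrative stand-in for dimension-adapted momentum: the momentum
    # parameter approaches 1 at a rate tied to the model size d, with a
    # correspondingly damped step size. NOT the paper's DANA algorithm.
    beta = 1.0 - 1.0 / np.sqrt(d)
    x, v = np.zeros(d), np.zeros(d)
    for _ in range(steps):
        v = beta * v - (lr / np.sqrt(d)) * stochastic_grad(x)
        x = x + v
    return loss(x)

print("SGD-M final loss:      ", sgd_momentum())
print("dim-adapted final loss:", dim_adapted_momentum())
```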
Implicit Diffusion: Efficient Optimization through Stochastic Sampling
Pierre Marion
Anna Korba
Peter Bartlett
Mathieu Blondel
Valentin De Bortoli
Arnaud Doucet
Felipe Llinares-López
Quentin Berthet
High Dimensional First Order Mini-Batch Algorithms on Quadratic Problems
Andrew Nicholas Cheng
Kiwon Lee
We analyze the dynamics of general mini-batch first order algorithms on the …
4+3 Phases of Compute-Optimal Neural Scaling Laws
Elliot Paquette
Lechao Xiao
Jeffrey Pennington
The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms
Elizabeth Collins-Woodfin
Inbar Seroussi
Begoña García Malaxechebarría
Andrew Mackenzie
Elliot Paquette
Mirror Descent Algorithms with Nearly Dimension-Independent Rates for Differentially-Private Stochastic Saddle-Point Problems (extended abstract)
Tomás González
Cristóbal Guzmán
Mirror Descent Algorithms with Nearly Dimension-Independent Rates for Differentially-Private Stochastic Saddle-Point Problems
Tomás González
Cristóbal Guzmán
Hitting the High-Dimensional Notes: An ODE for SGD learning dynamics on GLMs and multi-index models
Elizabeth Collins-Woodfin
Elliot Paquette
Inbar Seroussi
Only Tails Matter: Average-Case Universality and Robustness in the Convex Regime
Leonardo Cunha
Fabian Pedregosa
Damien Scieur