Courtney Paquette

Associate Academic Member
Canada CIFAR AI Chair
Assistant Professor, McGill University, Department of Mathematics and Statistics
Research Scientist, Google Brain
Research Topics
Optimization

Biography

Courtney Paquette is an assistant professor at McGill University and a Canada CIFAR AI Chair at Mila – Quebec Artificial Intelligence Institute.

Her research focuses on designing and analyzing algorithms for large-scale optimization problems, motivated by applications in data science.

She received her PhD in mathematics from the University of Washington (2017), held postdoctoral positions at Lehigh University (2017–2018) and the University of Waterloo (NSF postdoctoral fellowship, 2018–2019), and was a research scientist at Google Brain in Montréal (2019–2020).

Current Students

Master's Research - McGill University
Postdoctorate - McGill University
PhD - Université de Montréal
Master's Research - McGill University
PhD - McGill University
Master's Research - McGill University
PhD - McGill University

Publications

Dimension-adapted Momentum Outscales SGD
Damien Ferbach
Katie Everett
Elliot Paquette
We investigate scaling laws for stochastic momentum algorithms with small batch on the power law random features model, parameterized by data complexity, target complexity, and model size. When trained with a stochastic momentum algorithm, our analysis reveals four distinct loss curve shapes determined by varying data-target complexities. While traditional stochastic gradient descent with momentum (SGD-M) yields identical scaling law exponents to SGD, dimension-adapted Nesterov acceleration (DANA) improves these exponents by scaling momentum hyperparameters based on model size and data complexity. This outscaling phenomenon, which also improves compute-optimal scaling behavior, is achieved by DANA across a broad range of data and target complexities, while traditional methods fall short. Extensive experiments on high-dimensional synthetic quadratics validate our theoretical predictions, and large-scale text experiments with LSTMs show that DANA's improved loss exponents over SGD hold in a practical setting.
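As a rough illustration of the dimension-scaled momentum idea described in this abstract, the sketch below runs small-batch heavy-ball momentum on a toy least-squares problem and compares a fixed momentum parameter with one that grows with the model dimension d. The schedule beta = 1 - 1/sqrt(d) is an illustrative assumption for exposition only, not the DANA schedule analyzed in the paper.

# Toy sketch: small-batch momentum on a random least-squares problem.
# The dimension-scaled momentum (beta = 1 - 1/sqrt(d)) is a hypothetical
# placeholder schedule, not the paper's DANA hyperparameters.
import numpy as np

rng = np.random.default_rng(0)

d, n = 200, 1000                                  # model size and sample count
A = rng.standard_normal((n, d)) / np.sqrt(d)      # random features
x_star = rng.standard_normal(d)
b = A @ x_star                                    # noiseless targets

def sgd_momentum(lr, beta, steps=2000, batch=8):
    """Small-batch SGD with heavy-ball momentum on 0.5 * mean((Ax - b)^2)."""
    x = np.zeros(d)
    v = np.zeros(d)
    for _ in range(steps):
        idx = rng.integers(0, n, size=batch)
        grad = A[idx].T @ (A[idx] @ x - b[idx]) / batch   # mini-batch gradient
        v = beta * v - lr * grad                          # momentum buffer
        x = x + v
    return 0.5 * np.mean((A @ x - b) ** 2)                # final full-batch loss

# Fixed momentum (SGD-M style) versus a dimension-scaled momentum (assumed schedule).
loss_fixed = sgd_momentum(lr=0.1, beta=0.9)
loss_scaled = sgd_momentum(lr=0.1, beta=1.0 - 1.0 / np.sqrt(d))

print(f"final loss, fixed beta:  {loss_fixed:.3e}")
print(f"final loss, scaled beta: {loss_scaled:.3e}")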
Implicit Diffusion: Efficient Optimization through Stochastic Sampling
Pierre Marion
Anna Korba
Peter Bartlett
Mathieu Blondel
Valentin De Bortoli
Arnaud Doucet
Felipe Llinares-López
Quentin Berthet
High Dimensional First Order Mini-Batch Algorithms on Quadratic Problems
Andrew Nicholas Cheng
Kiwon Lee
We analyze the dynamics of general mini-batch first order algorithms on the …
4+3 Phases of Compute-Optimal Neural Scaling Laws
Elliot Paquette
Lechao Xiao
Jeffrey Pennington
The High Line: Exact Risk and Learning Rate Curves of Stochastic Adaptive Learning Rate Algorithms
Elizabeth Collins-Woodfin
Inbar Seroussi
Begoña García Malaxechebarría
Andrew Mackenzie
Elliot Paquette
Mirror Descent Algorithms with Nearly Dimension-Independent Rates for Differentially-Private Stochastic Saddle-Point Problems (extended abstract)
Tomás González
Cristóbal Guzmán
Mirror Descent Algorithms with Nearly Dimension-Independent Rates for Differentially-Private Stochastic Saddle-Point Problems
Tomás González
Cristóbal Guzmán
Hitting the High-Dimensional Notes: An ODE for SGD learning dynamics on GLMs and multi-index models
Elizabeth Collins-Woodfin
Elliot Paquette
Inbar Seroussi
Only Tails Matter: Average-Case Universality and Robustness in the Convex Regime
Leonardo Cunha
Fabian Pedregosa
Damien Scieur