Quentin Fournier

quentin.fournier@mila.quebec

Research Fellow, Talent and Ecosystem

Blog Posts

A digital picture of Bert from Sesame street, wering black trench coat and sunglasses

March 3, 2025

NeoBERT: A New Frontier for Open-Source Encoder Language Models

Lola Le Breton

Quentin Fournier

Sarath Chandar

Read the article

Publications

Manifold Metric: A Loss Landscape Approach for Predicting Model Performance

Pranshu Malviya

Jerry Huang

Quentin Fournier

Sarath Chandar

Determining the optimal model for a given task often requires training multiple models from scratch, which becomes impractical as dataset an… (see more)d model sizes grow. A more efficient alternative is to expand smaller pre-trained models, but this approach is underutilized due to a limited understanding of its impact on the training dynamics. Existing methods for quantifying this impact have notable limitations, including computation cost. To address this, we introduce a new perspective based on the loss landscape, which has been shown to contain a manifold of linearly connected minima. Specifically, we propose a metric that estimates the size of this manifold to study the impact of model expansion. Our experiments reveal a strong correlation between performance gains and our manifold metric, enabling more informed model comparison and offering a first step toward a geometry-driven approach for reliable model expansion. Notably, our metric outperforms other baselines, even when different types of expansion with equivalent number of parameters are applied to a model.

2024-05-24

ArXiv (preprint)

arxiv.org

Predicting the Impact of Model Expansion through the Minima Manifold: A Loss Landscape Perspective

Pranshu Malviya

Jerry Huang

Quentin Fournier

Sarath Chandar

The optimal model for a given task is often challenging to determine, requiring training multiple models from scratch which becomes prohibit… (see more)ive as dataset and model sizes grow. A more efficient alternative is to reuse smaller pre-trained models by expanding them, however, this is not widely adopted as how this impacts training dynamics remains poorly understood. While prior works have introduced statistics to measure these effects, they remain flawed. To rectify this, we offer a new approach for understanding and quantifying the impact of expansion through the lens of the loss landscape, which has been shown to contain a manifold of linearly connected minima. Building on this new perspective, we propose a metric to study the impact of expansion by estimating the size of the manifold. Experimental results show a clear relationship between gains in performance and manifold size, enabling the comparison of candidate models and presenting a first step towards expanding models more reliably based on geometric properties of the loss landscape.

2024-05-24

ArXiv (preprint)

doi.org

arxiv.org

Exploring Quantization for Efficient Pre-Training of Transformer Language Models

Kamran Chitsaz

Quentin Fournier

Goncalo Mordido

Sarath Chandar

The increasing scale of Transformer models has led to an increase in their pre-training computational requirements. While quantization has p… (see more)roven to be effective after pre-training and during fine-tuning, applying quantization in Transformers during pre-training has remained largely unexplored at scale for language modeling. This study aims to explore the impact of quantization for efficient pre-training of Transformers, with a focus on linear layer components. By systematically applying straightforward linear quantization to weights, activations, gradients, and optimizer states, we assess its effects on model efficiency, stability, and performance during training. By offering a comprehensive recipe of effective quantization strategies to be applied during the pre-training of Transformers, we promote high training efficiency from scratch while retaining language modeling ability. Code is available at https://github.com/chandar-lab/EfficientLLMs.

2024-01-01

EMNLP (Findings) (published)

doi.org

arxiv.org

AI Advantage

Leveraging AI for a Sustainable Future

Mila AI Policy Fellowship

AI Advantage

Leveraging AI for a Sustainable Future

Quentin Fournier

Blog Posts

Publications

AI Advantage

Leveraging AI for a Sustainable Future

Mila AI Policy Fellowship

AI Advantage

Leveraging AI for a Sustainable Future

Popular keywords:

Quentin Fournier

Blog Posts

Publications