Sarath Chandar

Biography

Sarath Chandar is an associate professor at Polytechnique Montreal's Department of Computer and Software Engineering, where he leads the Chandar Research Lab. He is also a Core Academic Member at Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair and the Canada Research Chair in Lifelong Machine Learning.

Chandar’s research interests include lifelong learning, deep learning, optimization, reinforcement learning and natural language processing. To promote research in lifelong learning, Chandar created the Conference on Lifelong Learning Agents (CoLLAs) in 2022, for which he served as program chair in 2022 and 2023.

He has a PhD from Université de Montréal and an MSc (By Research) from the Indian Institute of Technology Madras.

Current Students

Ista Abbes

Master's Research - Université de Montréal

Alex Aselstyne

Master's Research - Polytechnique Montréal

Davide Baldelli

PhD - Polytechnique Montréal

Co-supervisor :

Milan Bhan

Collaborating researcher

Diego Cerda Mardini

Master's Research - McGill University

Antoine Clavaud

Master's Research - Polytechnique Montréal

Naga Karthik Enamundram

PhD - Polytechnique Montréal

Principal supervisor :

Prashant Govindarajan

PhD - Polytechnique Montréal

Simon Guiroy

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

David Heurtel-Depeiges

PhD - Polytechnique Montréal

Jerry Huang

PhD - Université de Montréal

Saurav Jha

Postdoctorate - Polytechnique Montréal

Amir Kalantari Dehaghi

Collaborating Alumni

Lola Le Breton

PhD - Polytechnique Montréal

Aidan Li

Master's Research - Université de Montréal

Co-supervisor :

Postdoctorate - Université de Montréal

PhD - Polytechnique Montréal

Roshan Munirathinam Sankaran Balaji

Research Intern - Polytechnique Montréal

Rayen Nacef

Collaborating researcher - Polytechnique Montréal

Hadi NekoeiQachkanloo

PhD - Université de Montréal

Nilaksh Nilaksh

PhD - Polytechnique Montréal

PhD - Université de Montréal

Linda Peinthiere

Collaborating researcher - Polytechnique Montréal

Gabriele Prato

PhD - Université de Montréal

Postdoctorate

Shaipranesh Senthilkumar

PhD - Polytechnique Montréal

Arjun Vaithilingam Sudhakar

Nour Shaheen

Master's Research - Polytechnique Montréal

Principal supervisor :

PhD - Polytechnique Montréal

Megh Thakkar

Master's Research - Université de Montréal

PhD - Polytechnique Montréal

Shawn Whitfield

Collaborating researcher

Kowen Woo

Research Intern - Polytechnique Montréal

Anabel XL

Postdoctorate - Université de Montréal

Abdelrahman Zayed

PhD - Polytechnique Montréal

Xutong Zhao

PhD - Polytechnique Montréal

Artem Zholus

PhD - Polytechnique Montréal

NeoBERT: A New Frontier for Open-Source Encoder Language Models

Blog Posts

A digital picture of Bert from Sesame Street, wearing a black trench coat and sunglasses

March 3, 2025

Lola Le Breton

Quentin Fournier

Sarath Chandar

Read the article

October 1, 2024

How Do We Explain AI and Ensure the Explanation Is True? Faithfulness Measurable Models Tell You How

Andreas Madsen

Siva Reddy

Sarath Chandar

Read the article

Publications

A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques

Megh Thakkar

Quentin Fournier

Matthew D Riemer

Pin-Yu Chen

Amal Zouaq

Payel Das

Large language models are first pre-trained on trillions of tokens and then instruction-tuned or aligned to specific preferences. While pre-training remains out of reach for most researchers due to the compute required, fine-tuning has become affordable thanks to parameter-efficient methods such as LoRA and QLoRA. Alignment is known to be sensitive to the many factors involved, including the quantity and quality of data, the alignment method, and the adapter rank. However, there has not yet been an extensive study of their effect on downstream performance. To address this gap, we conduct an in-depth investigation of the impact of popular choices for three crucial axes: (i) the alignment dataset (HH-RLHF and BeaverTails), (ii) the alignment technique (SFT and DPO), and (iii) the model (LLaMA-1, Vicuna-v1.3, Mistral-7b, and Mistral-7b-Instruct). Our extensive setup spanning over 300 experiments reveals consistent trends and unexpected findings. We observe how more informative data helps with preference alignment, cases where supervised fine-tuning outperforms preference optimization, and how aligning to a distinct preference boosts performance on downstream tasks. Through our in-depth analyses, we put forward key guidelines to help researchers perform more effective parameter-efficient LLM alignment.

2024-06-07

ArXiv (preprint)

BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning

Artem Zholus

Maksim Kuznetsov

Roman Schutski

Shayakhmetov Rim

Daniil Polykovskiy

Alex Zhavoronkov

Generating novel active molecules for a given protein is an extremely challenging task for generative models that requires an understanding of the complex physical interactions between the molecule and its environment. In this paper, we present a novel generative model, BindGPT, which uses a conceptually simple but powerful approach to create 3D molecules within the protein's binding site. Our model produces molecular graphs and conformations jointly, eliminating the need for an extra graph reconstruction step. We pretrain BindGPT on a large-scale dataset and fine-tune it with reinforcement learning using scores from external simulation software. We demonstrate how a single pretrained language model can serve at the same time as a 3D molecular generative model, a conformer generator conditioned on the molecular graph, and a pocket-conditioned 3D molecule generator. Notably, the model does not make any representational equivariance assumptions about the domain of generation. We show how such a simple conceptual approach combined with pretraining and scaling can perform on par with or better than the current best specialized diffusion models, language models, and graph neural networks while being two orders of magnitude cheaper to sample.

2024-06-06

ArXiv (preprint)

A responsible framework for applying artificial intelligence on medical images and signals at the point-of-care: the PACS-AI platform.

Pascal Thériault-Lauzier

Denis Cobin

Olivier Tastet

Élodie Labrecque Langlais

B. Taji

Guson Kang

A. Chong

Derek So

An Tang

Judy Wawira Gichoya

Pierre-Luc Deziel

Julie Hussin

Samuel Kadoury

Robert Avram

2024-06-01

Canadian Journal of Cardiology (published)

On the Costs and Benefits of Adopting Lifelong Learning for Software Analytics -- Empirical Study on Brown Build and Risk Prediction

Doriane Olewicki

Sarra Habchi

Mathieu Nayrolles

Mojtaba Faramarzi

Bram Adams

Nowadays, software analytics tools using machine learning (ML) models to, for example, predict the risk of a code change are well established. However, as the goals of a project shift over time, and developers and their habits change, the performance of said models tends to degrade (drift) over time. Current retraining practices typically require retraining a new model from scratch on a large updated dataset when performance decay is observed, thus incurring a computational cost; also there is no continuity between the models as the past model is discarded and ignored during the new model training. Even though the literature has taken interest in online learning approaches, those have rarely been integrated and evaluated in industrial environments. This paper evaluates the use of lifelong learning (LL) for industrial use cases at Ubisoft, evaluating both the performance and the required computational effort in comparison to the retraining-from-scratch approaches commonly used by the industry. LL is used to continuously build and maintain ML-based software analytics tools using an incremental learner that progressively updates the old model using new data. To avoid so-called "catastrophic forgetting" of important older data points, we adopt a replay buffer of older data, which still allows us to drastically reduce the size of the overall training dataset, and hence model training time.

2024-05-31

Proceedings of the 46th International Conference on Software Engineering: Software Engineering in Practice (published)

Manifold Metric: A Loss Landscape Approach for Predicting Model Performance

Determining the optimal model for a given task often requires training multiple models from scratch, which becomes impractical as dataset and model sizes grow. A more efficient alternative is to expand smaller pre-trained models, but this approach is underutilized due to a limited understanding of its impact on the training dynamics. Existing methods for quantifying this impact have notable limitations, including computation cost. To address this, we introduce a new perspective based on the loss landscape, which has been shown to contain a manifold of linearly connected minima. Specifically, we propose a metric that estimates the size of this manifold to study the impact of model expansion. Our experiments reveal a strong correlation between performance gains and our manifold metric, enabling more informed model comparison and offering a first step toward a geometry-driven approach for reliable model expansion. Notably, our metric outperforms other baselines, even when different types of expansion with equivalent number of parameters are applied to a model.

2024-05-24

ArXiv (preprint)

Predicting the Impact of Model Expansion through the Minima Manifold: A Loss Landscape Perspective

The optimal model for a given task is often challenging to determine, requiring training multiple models from scratch, which becomes prohibitive as dataset and model sizes grow. A more efficient alternative is to reuse smaller pre-trained models by expanding them; however, this is not widely adopted because how it impacts training dynamics remains poorly understood. While prior works have introduced statistics to measure these effects, they remain flawed. To rectify this, we offer a new approach for understanding and quantifying the impact of expansion through the lens of the loss landscape, which has been shown to contain a manifold of linearly connected minima. Building on this new perspective, we propose a metric to study the impact of expansion by estimating the size of the manifold. Experimental results show a clear relationship between gains in performance and manifold size, enabling the comparison of candidate models and presenting a first step towards expanding models more reliably based on geometric properties of the loss landscape.

2024-05-24

ArXiv (preprint)

Interpretability Needs a New Paradigm

Andreas Madsen

Himabindu Lakkaraju

Siva Reddy

2024-05-08

ArXiv (preprint)

Sub-goal Distillation: A Method to Improve Small Language Agents

Maryam Hashemzadeh

Elias Stengel-Eskin

Marc-Alexandre Côté

While Large Language Models (LLMs) have demonstrated significant promise as agents in interactive tasks, their substantial computational requirements and restricted number of calls constrain their practical utility, especially in long-horizon interactive tasks such as decision-making or in scenarios involving continuous ongoing tasks. To address these constraints, we propose a method for transferring the performance of an LLM with billions of parameters to a much smaller language model (770M parameters). Our approach involves constructing a hierarchical agent comprising a planning module, which learns through Knowledge Distillation from an LLM to generate sub-goals, and an execution module, which learns to accomplish these sub-goals using elementary actions. In detail, we leverage an LLM to annotate an oracle path with a sequence of sub-goals towards completing a goal. Subsequently, we utilize this annotated data to fine-tune both the planning and execution modules. Importantly, neither module relies on real-time access to an LLM during inference, significantly reducing the overall cost associated with LLM interactions to a fixed cost. In ScienceWorld, a challenging and multi-task interactive text environment, our method surpasses standard imitation learning based solely on elementary actions by 16.7% (absolute). Our analysis highlights the efficiency of our approach compared to other LLM-based methods. Our code and annotated data for distillation can be found on GitHub.

2024-05-04

ArXiv (preprint)