Sarath Chandar

Biography

Sarath Chandar is an associate professor at Polytechnique Montreal's Department of Computer and Software Engineering, where he leads the Chandar Research Lab. He is also a Core Academic Member at Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair and the Canada Research Chair in Lifelong Machine Learning.

Chandar’s research interests include lifelong learning, deep learning, optimization, reinforcement learning and natural language processing. To promote research in lifelong learning, Chandar created the Conference on Lifelong Learning Agents (CoLLAs) in 2022, for which he served as program chair in 2022 and 2023.

He has a PhD from Université de Montréal and an MSc (By Research) from the Indian Institute of Technology Madras.

Current Students

Ista Abbes

Master's Research - Université de Montréal

Alex Aselstyne

Research Intern - Polytechnique Montréal

Davide Baldelli

PhD - Polytechnique Montréal

Co-supervisor :

joe Ben

Research Intern - Polytechnique Montréal

joumenbensaid@gmail.com

Antoine Clavaud

Master's Research - Polytechnique Montréal

Naga Karthik Enamundram

PhD - Polytechnique Montréal

Principal supervisor :

Julien Cohen-Adad

emvnagakarthik@gmail.com

Prashant Govindarajan

PhD - Polytechnique Montréal

Simon Guiroy

PhD - Université de Montréal

Principal supervisor :

Collaborating researcher - Université de Montréal

Principal supervisor :

Liam Paull

Maryam Hashemzadeh

PhD - Université de Montréal

David Heurtel-Depeiges

PhD - Polytechnique Montréal

Amir Ardalan Kalantari Dehaghi

Jerry Huang

PhD - Université de Montréal

Collaborating Alumni

Lola Le Breton

Master's Research - Polytechnique Montréal

Ekaterina Lobacheva

Postdoctorate - Université de Montréal

PhD - Polytechnique Montréal

Roshan Munirathinam Sankaran Balaji

Mohamed Amine Merzouk

Postdoctorate - Polytechnique Montréal

Principal supervisor :

Research Intern - Polytechnique Montréal

Hadi NekoeiQachkanloo

PhD - Université de Montréal

Darshan Patil

PhD - Université de Montréal

Gabriele Prato

PhD - Université de Montréal

Postdoctorate

Independent visiting researcher

Mohammad R. Samsami

Master's Research - Université de Montréal

Master's Research - Polytechnique Montréal

Arjun Vaithilingam Sudhakar

Megh Thakkar

Master's Research - Université de Montréal

PhD - Polytechnique Montréal

Kowen Woo

Research Intern - Polytechnique Montréal

Abdelrahman Zayed

PhD - Polytechnique Montréal

Xutong Zhao

PhD - Polytechnique Montréal

Artem Zholus

PhD - Polytechnique Montréal

NeoBERT: A New Frontier for Open-Source Encoder Language Models

Blog Posts

A digital picture of Bert from Sesame Street, wearing a black trench coat and sunglasses

March 3, 2025

Lola Le Breton

Quentin Fournier

Sarath Chandar

October 1, 2024

How Do We Explain AI and Ensure the Explanation Is True? Faithfulness Measurable Models Tell You How

Andreas Madsen

Siva Reddy

Sarath Chandar

Publications

Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs

Megh Thakkar

Yash More

Quentin Fournier

Matthew D Riemer

Pin-Yu Chen

Amal Zouaq

Payel Das

There is a growing interest in training domain-expert LLMs that excel in specific technical fields compared to their general-purpose instruction-tuned counterparts. However, these expert models often experience a loss in their safety abilities in the process, making them capable of generating harmful content. As a solution, we introduce an efficient and effective merging-based alignment method called MergeAlign that interpolates the domain and alignment vectors, creating safer domain-specific models while preserving their utility. We apply MergeAlign on Llama3 variants that are experts in medicine and finance, obtaining substantial alignment improvements with minimal to no degradation on domain-specific benchmarks. We study the impact of model merging through model similarity metrics and contributions of individual models being merged. We hope our findings open new research avenues and inspire more efficient development of safe expert LLMs.

2024-11-11

ArXiv (preprint)

Crystal Design Amidst Noisy DFT Signals: A Reinforcement Learning Approach

Prashant Govindarajan

Mathieu Reymond

Santiago Miret

Mariano Phielipp

2024-11-03

NeurIPS.cc/2024/Workshop/AI4Mat (published)

openreview.net

Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination

Jerry Huang

Prasanna Parthasarathi

Mehdi Rezagholizadeh

Boxing Chen

The growth in prominence of large language models (LLMs) in everyday life can be largely attributed to their generative abilities, yet some of this is also owed to the risks and costs associated with their use. On one front is their tendency to hallucinate false or misleading information, limiting their reliability. On another is the increasing focus on the computational limitations associated with traditional self-attention based LLMs, which has brought about new alternatives, in particular recurrent models, meant to overcome them. Yet it remains uncommon to consider these two concerns simultaneously. Do changes in architecture exacerbate/alleviate existing concerns about hallucinations? Do they affect how and where they occur? Through an extensive evaluation, we study how these architecture-based inductive biases affect the propensity to hallucinate. While hallucination remains a general phenomenon not limited to specific architectures, the situations in which they occur and the ease with which specific types of hallucinations can be induced can significantly differ based on the model architecture. These findings highlight the need for better understanding both these problems in conjunction with each other, as well as consider how to design more universal techniques for handling hallucinations.

2024-10-22

ArXiv (preprint)

Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs

Megh Thakkar

Yash More

Quentin Fournier

Matthew D Riemer

Pin-Yu Chen

Amal Zouaq

Payel Das

Chandar Research Lab

Mila - Québec AI Institute

IBM Research

Polytechnique Montréal

2024-10-10

NeurIPS.cc/2024/Workshop/AFM (poster)

openreview.net

Toward Debugging Deep Reinforcement Learning Programs with RLExplorer

Rached Bouchoucha

Ahmed Haj Yahmed

Darshan Patil

Janarthanan Rajendran

Amin Nikanjam

Foutse Khomh

Deep reinforcement learning (DRL) has shown success in diverse domains such as robotics, computer games, and recommendation systems. However, like any other software system, DRL-based software systems are susceptible to faults that pose unique challenges for debugging and diagnosing. These faults often result in unexpected behavior without explicit failures and error messages, making debugging difficult and time-consuming. Therefore, automating the monitoring and diagnosis of DRL systems is crucial to alleviate the burden on developers. In this paper, we propose RLExplorer, the first fault diagnosis approach for DRL-based software systems. RLExplorer automatically monitors training traces and runs diagnosis routines based on properties of the DRL learning dynamics to detect the occurrence of DRL-specific faults. It then logs the results of these diagnoses as warnings that cover theoretical concepts, recommended practices, and potential solutions to the identified faults. We conducted two sets of evaluations to assess RLExplorer. Our first evaluation of faulty DRL samples from Stack Overflow revealed that our approach can effectively diagnose real faults in 83% of the cases. Our second evaluation of RLExplorer with 15 DRL experts/developers showed that (1) RLExplorer could identify 3.6 times more defects than manual debugging and (2) RLExplorer is easily integrated into DRL applications.

2024-10-06

ArXiv (preprint)

Toward Debugging Deep Reinforcement Learning Programs with RLExplorer

Rached Bouchoucha

Ahmed Haj Yahmed

Darshan Patil

Janarthanan Rajendran

Amin Nikanjam

Foutse Khomh

2024-10-06

2024 IEEE International Conference on Software Maintenance and Evolution (ICSME) (published)

Balancing Context Length and Mixing Times for Reinforcement Learning at Scale

Matthew D Riemer

Khimya Khetarpal

Janarthanan Rajendran

2024-09-25

NeurIPS.cc/2024/Conference (poster)

openreview.net

Protein Language Models: Is Scaling Necessary?

Quentin Fournier

Robert M. Vernon

Almer van der Sloot

Benjamin Schulz

Christopher James Langmead

Public protein sequence databases contain samples from the fitness landscape explored by nature. Protein language models (pLMs) pre-trained on these sequences aim to capture this landscape for tasks like property prediction and protein design. Following the same trend as in natural language processing, pLMs have continuously been scaled up. However, the premise that scale leads to better performance assumes that source databases provide accurate representation of the underlying fitness landscape, which is likely false. By developing an efficient codebase, designing a modern architecture, and addressing data quality concerns such as sample bias, we introduce AMPLIFY, a best-in-class pLM that is orders of magnitude less expensive to train and deploy than previous models. Furthermore, to support the scientific community and democratize the training of pLMs, we have open-sourced AMPLIFY’s pre-training codebase, data, and model checkpoints.

2024-09-23

bioRxiv (preprint)

Are self-explanations from Large Language Models faithful?

Andreas Madsen

Siva Reddy

2024-08-01

Findings of the Association for Computational Linguistics ACL 2024 (published)