Sarath Chandar

Biography

Sarath Chandar is an associate professor at Polytechnique Montreal's Department of Computer and Software Engineering, where he leads the Chandar Research Lab. He is also a Core Academic Member at Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair and the Canada Research Chair in Lifelong Machine Learning.

Chandar’s research interests include lifelong learning, deep learning, optimization, reinforcement learning and natural language processing. To promote research in lifelong learning, Chandar created the Conference on Lifelong Learning Agents (CoLLAs) in 2022, for which he served as program chair in 2022 and 2023.

He has a PhD from Université de Montréal and an MSc (By Research) from the Indian Institute of Technology Madras.

Current Students

Ista Abbes

Master's Research - Université de Montréal

Davide Baldelli

PhD - Polytechnique Montréal

Co-supervisor:

Master's Research - Polytechnique Montréal

Naga Karthik Enamundram

PhD - Polytechnique Montréal

Principal supervisor: Julien Cohen-Adad

emvnagakarthik@gmail.com

Prashant Govindarajan

PhD - Polytechnique Montréal

Simon Guiroy

PhD - Université de Montréal

Principal supervisor:

Collaborating researcher - Université de Montréal

Principal supervisor: Liam Paull

Maryam Hashemzadeh

PhD - Université de Montréal

David Heurtel-Depeiges

PhD - Polytechnique Montréal

Amir Ardalan Kalantari Dehaghi

Jerry Huang

PhD - Université de Montréal

Collaborating Alumni

Lola Le Breton

Master's Research - Polytechnique Montréal

Ekaterina Lobacheva

Postdoctorate - Université de Montréal

PhD - Polytechnique Montréal

Mohamed Amine Merzouk

Postdoctorate - Polytechnique Montréal

Principal supervisor:

Hadi NekoeiQachkanloo

PhD - Université de Montréal

Darshan Patil

PhD - Université de Montréal

Gabriele Prato

PhD - Université de Montréal

Postdoctorate

Independent visiting researcher

Mohammad R. Samsami

Master's Research - Université de Montréal

Master's Research - Polytechnique Montréal

Arjun Vaithilingam Sudhakar

Megh Thakkar

Master's Research - Université de Montréal

PhD - Polytechnique Montréal

Abdelrahman Zayed

PhD - Polytechnique Montréal

Xutong Zhao

PhD - Polytechnique Montréal

Artem Zholus

PhD - Polytechnique Montréal

Blog Posts

A digital picture of Bert from Sesame Street, wearing a black trench coat and sunglasses

March 3, 2025

NeoBERT: A New Frontier for Open-Source Encoder Language Models

Lola Le Breton

Quentin Fournier

Sarath Chandar

October 1, 2024

How Do We Explain AI and Ensure the Explanation Is True? Faithfulness Measurable Models Tell You How

Andreas Madsen

Siva Reddy

Sarath Chandar

Publications

Monitoring morphometric drift in lifelong learning segmentation of the spinal cord

Enamundram Naga Karthik

Sandrine Bédard

Jan Valošek

Christoph Aigner

Elise Bannier

Josef Bednařík

Virginie Callot

Anna Combes

Armin Curt

Gergely David

Falk Eippert

Lynn Farner

M. G. Fehlings

Patrick Freund

Tobias Granberg

Cristina Granziera

RHSCIR Network Imaging Group

Ulrike Horn

Tomáš Horák

Suzanne Humphreys

Markus Hupp

Anne Kerbrat

Nawal Kinany

Shannon Kolind

Petr Kudlička

Anna Lebret

Lisa Eunyoung Lee

Caterina Mainero

Allan R. Martin

Megan McGrath

Govind Nair

Kristin P. O’Grady

Jiwon Oh

Russell Ouellette

Nikolai Pfender

Dario Pfyffer

P. Pradat

Alexandre Prat

Emanuele Pravatà

D. S. Reich

Ilaria Ricchi

Naama Rotem-Kohavi

Simon Schading-Sassenhausen

Maryam Seif

Andrew C. Smith

Seth Aaron Smith

Grace Sweeney

Roger Tam

Anthony Traboulsee

Constantina A. Treaba

Charidimos Tsagkas

Zachary Vavasour

Dimitri Van De Ville

Kenneth A. Weber

Julien Cohen-Adad

2025-05-02

ArXiv (preprint)

BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning

Artem Zholus

Maksim Kuznetsov

Roman Schutski

Rim Shayakhmetov

Daniil Polykovskiy

Alex Zhavoronkov

Generating novel active molecules for a given protein is an extremely challenging task for generative models that requires an understanding of the complex physical interactions between the molecule and its environment. In this paper, we present a novel generative model, BindGPT which uses a conceptually simple but powerful approach to create 3D molecules within the protein's binding site. Our model produces molecular graphs and conformations jointly, eliminating the need for an extra graph reconstruction step. We pretrain BindGPT on a large-scale dataset and fine-tune it with reinforcement learning using scores from external simulation software. We demonstrate how a single pretrained language model can serve at the same time as a 3D molecular generative model, conformer generator conditioned on the molecular graph, and a pocket-conditioned 3D molecule generator. Notably, the model does not make any representational equivariance assumptions about the domain of generation. We show how such simple conceptual approach combined with pretraining and scaling can perform on par or better than the current best specialized diffusion models, language models, and graph neural networks while being two orders of magnitude cheaper to sample.

2025-04-11

Proceedings of the AAAI Conference on Artificial Intelligence (published)

TAPNext: Tracking Any Point (TAP) as Next Token Prediction

Artem Zholus

Carl Doersch

Yi Yang

Skanda Koppula

Viorica Patraucean

Xu Owen He

Ignacio Rocco

Mehdi S. M. Sajjadi

Ross Goroshin

2025-04-08

ArXiv (preprint)

CrystalGym: A New Benchmark for Materials Discovery Using Reinforcement Learning

Prashant Govindarajan

Mathieu Reymond

Antoine Clavaud

Mariano Phielipp

Santiago Miret

*In silico* design and optimization of new materials primarily relies on high-accuracy atomic simulators that perform density functional theory (DFT) calculations. While recent works showcase the strong potential of machine learning to accelerate the material design process, they mostly consist of generative approaches that do not use direct DFT signals as feedback to improve training and generation mainly due to DFT's high computational cost. To aid the adoption of direct DFT signals in the materials design loop through online reinforcement learning (RL), we propose **CrystalGym**, an open-source RL environment for crystalline material discovery. Using CrystalGym, we benchmark value- and policy-based reinforcement learning algorithms for designing various crystals conditioned on target properties. Concretely, we optimize for challenging properties like the band gap, bulk modulus, and density, which are directly calculated from DFT in the environment. While none of the algorithms we benchmark solve all CrystalGym tasks, our extensive experiments and ablations show different sample efficiencies and ease of convergence to optimality for different algorithms and environment settings. Our goal is for CrystalGym to serve as a test bed for reinforcement learning researchers and material scientists to address these real-world design problems with practical applications. Furthermore, we introduce a novel class of challenges for reinforcement learning methods dealing with time-consuming reward signals, paving the way for future interdisciplinary research for machine learning motivated by real-world applications.

2025-03-03

ICLR.cc/2025/Workshop/AI4MAT (spotlight)

openreview.net

Steering Large Language Model Activations in Sparse Spaces

Reza Bayat

Ali Rahimi-Kalahroudi

Mohammad Pezeshki

Pascal Vincent

A key challenge in AI alignment is guiding large language models (LLMs) to follow desired behaviors at test time. Activation steering, which modifies internal model activations during inference, offers a potential solution. However, prior work in dense activation spaces struggles with superposition, wherein multiple features become entangled, limiting interpretability and precise control. In contrast, sparse representations provide an untapped opportunity for more interpretable behavior modulation. In this work, we introduce sparse activation steering (SAS), a method that leverages sparse autoencoders (SAEs) to steer LLM behavior in sparse spaces. By isolating behavior-specific features through a contrastive prompt-pairing approach, we define a set of features that can selectively reinforce or suppress behaviors. Experiments on Gemma 2 LLMs show that SAS vectors enable nuanced behavioral modulation and finer-grained control. Furthermore, scaling SAEs improves monosemanticity of SAS vectors, suggesting more reliable and interpretable interventions.

2025-02-28

ArXiv (preprint)

NeoBERT: A Next-Generation BERT

Lola Le Breton

Quentin Fournier

Mariam El Mezouar

Recent innovations in architecture, pre-training, and fine-tuning have led to the remarkable in-context learning and reasoning abilities of large auto-regressive language models such as LLaMA and DeepSeek. In contrast, encoders like BERT and RoBERTa have not seen the same level of progress despite being foundational for many downstream NLP applications. To bridge this gap, we introduce NeoBERT, a next-generation encoder that redefines the capabilities of bidirectional models by integrating state-of-the-art advancements in architecture, modern data, and optimized pre-training methodologies. NeoBERT is designed for seamless adoption: it serves as a plug-and-play replacement for existing base models, relies on an optimal depth-to-width ratio, and leverages an extended context length of 4,096 tokens. Despite its compact 250M parameter footprint, it achieves state-of-the-art results on the massive MTEB benchmark, outperforming BERT large, RoBERTa large, NomicBERT, and ModernBERT under identical fine-tuning conditions. In addition, we rigorously evaluate the impact of each modification on GLUE and design a uniform fine-tuning and evaluation framework for MTEB. We release all code, data, checkpoints, and training scripts to accelerate research and real-world adoption.

2025-02-26

ArXiv (preprint)

Sub-goal Distillation: A Method to Improve Small Language Agents

Maryam Hashemzadeh

Elias Stengel-Eskin

Marc-Alexandre Côté

While Large Language Models (LLMs) have demonstrated significant promise as agents in interactive tasks, their substantial computational requirements and restricted number of calls constrain their practical utility, especially in long-horizon interactive tasks such as decision-making or in scenarios involving continuous ongoing tasks. To address these constraints, we propose a method for transferring the performance of an LLM with billions of parameters to a much smaller language model (770M parameters). Our approach involves constructing a hierarchical agent comprising a planning module, which learns through Knowledge Distillation from an LLM to generate sub-goals, and an execution module, which learns to accomplish these sub-goals using elementary actions. In detail, we leverage an LLM to annotate an oracle path with a sequence of sub-goals towards completing a goal. Subsequently, we utilize this annotated data to fine-tune both the planning and execution modules. Importantly, neither module relies on real-time access to an LLM during inference, significantly reducing the overall cost associated with LLM interactions to a fixed cost. In ScienceWorld, a challenging and multi-task interactive text environment, our method surpasses standard imitation learning based solely on elementary actions by 16.7% (absolute). Our analysis highlights the efficiency of our approach compared to other LLM-based methods. Our code and annotated data for distillation can be found on GitHub.

2025-02-17

Proceedings of The 3rd Conference on Lifelong Learning Agents (published)

A Generalist Hanabi Agent

Arjun V Sudhakar

Hadi Nekoei

Mathieu Reymond

Miao Liu

Janarthanan Rajendran

Gintare Karolina Dziugaite

Traditional multi-agent reinforcement learning (MARL) systems can develop cooperative strategies through repeated interactions. However, these systems are unable to perform well on any other setting than the one they have been trained on, and struggle to successfully cooperate with unfamiliar collaborators. This is particularly visible in the Hanabi benchmark, a popular 2-to-5 player cooperative card-game which requires complex reasoning and precise assistance to other agents. Current MARL agents for Hanabi can only learn one specific game-setting (e.g., 2-player games), and play with the same algorithmic agents. This is in stark contrast to humans, who can quickly adjust their strategies to work with unfamiliar partners or situations. In this paper, we introduce Recurrent Replay Relevance Distributed DQN (R3D2), a generalist agent for Hanabi, designed to overcome these limitations. We reformulate the task using text, as language has been shown to improve transfer. We then propose a distributed MARL algorithm that copes with the resulting dynamic observation- and action-space. In doing so, our agent is the first that can play all game settings concurrently, and extend strategies learned from one setting to other ones. As a consequence, our agent also demonstrates the ability to collaborate with different algorithmic agents, agents that are themselves unable to do so.

2025-01-22

ICLR.cc/2025/Conference (poster)

openreview.net

Torque-Aware Momentum

Pranshu Malviya

Gonçalo Mordido

Aristide Baratin

Reza Babanezhad Harikandeh

Razvan Pascanu

Efficiently exploring complex loss landscapes is key to the performance of deep neural networks. While momentum-based optimizers are widely used in state-of-the-art setups, classical momentum can still struggle with large, misaligned gradients, leading to oscillations. To address this, we propose Torque-Aware Momentum (TAM), which introduces a damping factor based on the angle between the new gradients and previous momentum, stabilizing the update direction during training. Empirical results show that TAM, which can be combined with both SGD and Adam, enhances exploration, handles distribution shifts more effectively, and improves generalization performance across various tasks, including image classification and large language model fine-tuning, when compared to classical momentum-based optimizers.

2024-12-25

ArXiv (preprint)

Gintare Karolina Dziugaite
