Sarath Chandar

Biography

Sarath Chandar is an associate professor at Polytechnique Montreal's Department of Computer and Software Engineering, where he leads the Chandar Research Lab. He is also a Core Academic Member at Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair and the Canada Research Chair in Lifelong Machine Learning.

Chandar’s research interests include lifelong learning, deep learning, optimization, reinforcement learning and natural language processing. To promote research in lifelong learning, Chandar created the Conference on Lifelong Learning Agents (CoLLAs) in 2022, for which he served as program chair in 2022 and 2023.

He has a PhD from Université de Montréal and an MSc (By Research) from the Indian Institute of Technology Madras.

Current Students

Ista Abbes

Master's Research - Université de Montréal

Alex Aselstyne

Master's Research - Polytechnique Montréal

Davide Baldelli

PhD - Polytechnique Montréal

Co-supervisor :

joe Ben

Research Intern - Polytechnique Montréal

joumenbensaid@gmail.com

Milan Bhan

Collaborating researcher

Diego Cerda Mardini

Master's Research - McGill University

Antoine Clavaud

Master's Research - Polytechnique Montréal

Naga Karthik Enamundram

PhD - Polytechnique Montréal

Principal supervisor :

Julien Cohen-Adad

emvnagakarthik@gmail.com

Prashant Govindarajan

PhD - Polytechnique Montréal

Simon Guiroy

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

David Heurtel-Depeiges

PhD - Polytechnique Montréal

Jerry Huang

PhD - Université de Montréal

Saurav Jha

Postdoctorate - Polytechnique Montréal

Amir Ardalan Kalantari Dehaghi

Collaborating Alumni

Lola Le Breton

PhD - Polytechnique Montréal

Aidan Li

Master's Research - Université de Montréal

Co-supervisor :

Postdoctorate - Université de Montréal

PhD - Polytechnique Montréal

Roshan Munirathinam Sankaran Balaji

Mohamed Amine Merzouk

Postdoctorate - Polytechnique Montréal

Principal supervisor :

Research Intern - Polytechnique Montréal

Rayen Nacef

Research Intern - Polytechnique Montréal

Hadi NekoeiQachkanloo

PhD - Université de Montréal

Nilaksh Nilaksh

PhD - Polytechnique Montréal

PhD - Université de Montréal

Linda Peinthiere

Collaborating researcher - Polytechnique Montréal

uphro_1@hotmail.com

Gabriele Prato

PhD - Université de Montréal

Postdoctorate

Independent visiting researcher

Mohammad R. Samsami

Master's Research - Université de Montréal

Shaipranesh Senthilkumar

PhD - Polytechnique Montréal

shaipraneshgci@gmail.com

Nour Shaheen

Master's Research - Polytechnique Montréal

Arjun Vaithilingam Sudhakar

Achille Sowa Samo

PhD - Polytechnique Montréal

Megh Thakkar

Master's Research - Université de Montréal

PhD - Polytechnique Montréal

Shawn Whitfield

Collaborating researcher

Kowen Woo

Research Intern - Polytechnique Montréal

Anabel XL

Postdoctorate - Université de Montréal

Abdelrahman Zayed

PhD - Polytechnique Montréal

PhD - Polytechnique Montréal

Artem Zholus

PhD - Polytechnique Montréal

Blog Posts

NeoBERT: A New Frontier for Open-Source Encoder Language Models

A digital picture of Bert from Sesame Street, wearing a black trench coat and sunglasses

March 3, 2025

Lola Le Breton

Quentin Fournier

Sarath Chandar

October 1, 2024

How Do We Explain AI and Ensure the Explanation Is True? Faithfulness Measurable Models Tell You How

Andreas Madsen

Siva Reddy

Sarath Chandar

Publications

Sharpness-Aware Minimization Scaled by Outlier Normalization for Robust DNNs on In-Memory Computing Accelerators

Sébastien Henwood

Goncalo Mordido

Yvon Savaria

François Leduc-Primeau

Many deep neural network (DNN) models consume a significant amount of energy at inference time, in large part due to energy consumed by memory access. In-memory computing addresses this problem by eliminating many memory accesses, but exposes model weights to noise and circuit variations. While several methods have been proposed to train DNNs robust to weight noise, they typically require knowledge of the noise distribution or degrade DNN performance in noiseless settings. In this work, we first show that applying sharpness-aware training, by optimizing for both the loss value and loss sharpness, significantly improves robustness to noisy weights at inference time. Then, we propose a new adaptive sharpness-aware method that conditions the worst-case perturbation of a given weight not only on its magnitude but also on the range of the weight distribution. This is achieved by performing sharpness-aware minimization scaled by outlier normalization (SAMSON). Results on computer-vision benchmarks show that SAMSON increases model robustness to noisy weights without compromising generalization performance in noiseless regimes.

2024-01-01

IEEECONF (published)

Fairness-Aware Structured Pruning in Transformers

Abdelrahman Zayed

Goncalo Mordido

Samira Shabanian

Ioana Baldini

The increasing size of large language models (LLMs) has introduced challenges in their training and inference. Removing model components is perceived as a solution to tackle the large model sizes; however, existing pruning methods solely focus on performance, without considering an essential aspect for the responsible use of LLMs: model fairness. It is crucial to address the fairness of LLMs towards diverse groups, such as women, Black people, LGBTQ+, Jewish communities, among others, as they are being deployed and available to a wide audience. In this work, first, we investigate how attention heads impact fairness and performance in pre-trained transformer-based language models. We then propose a novel method to prune the attention heads that negatively impact fairness while retaining the heads critical for performance, i.e. language modeling capabilities. Our approach is practical in terms of time and resources, as it does not require fine-tuning the final pruned, and fairer, model. Our findings demonstrate a reduction in gender bias by 19%, 19.5%, 39.5%, 34.7%, 23%, and 8% for DistilGPT-2, GPT-2, GPT-Neo of two different sizes, GPT-J, and Llama 2 models, respectively, in comparison to the biased model, with only a slight decrease in performance. WARNING: This work uses language that is offensive in nature.

2023-12-24

ArXiv (preprint)

Measuring the Knowledge Acquisition-Utilization Gap in Pretrained Language Models

Amirhossein Kazemnejad

Mehdi Rezagholizadeh

Prasanna Parthasarathi

2023-12-01

Findings of the Association for Computational Linguistics: EMNLP 2023 (published)

Dealing With Non-stationarity in Decentralized Cooperative Multi-Agent Deep Reinforcement Learning via Multi-Timescale Learning

Hadi Nekoei

Akilesh Badrinaaraayanan

Amit Sinha

Mohammad Amin Amini

Janarthanan Rajendran

Aditya Mahajan

2023-11-20

Proceedings of The 2nd Conference on Lifelong Learning Agents (published)

Towards Few-shot Coordination: Revisiting Ad-hoc Teamplay Challenge In the Game of Hanabi

Hadi Nekoei

Janarthanan Rajendran

Miao Liu

2023-11-20

Proceedings of The 2nd Conference on Lifelong Learning Agents (published)

Language Model-In-The-Loop: Data Optimal Approach to Learn-To-Recommend Actions in Text Games

Arjun Vaithilingam Sudhakar

Prasanna Parthasarathi

Janarthanan Rajendran

2023-11-13

ArXiv (preprint)

EpiK-Eval: Evaluation for Language Models as Epistemic Models

Gabriele Prato

Jerry Huang

Prasanna Parthasarathi

Shagun Sodhani

In the age of artificial intelligence, the role of large language models (LLMs) is becoming increasingly central. Despite their growing prevalence, their capacity to consolidate knowledge from different training documents—a crucial ability in numerous applications—remains unexplored. This paper presents the first study examining the capability of LLMs to effectively combine such information within their parameter space. We introduce EpiK-Eval, a novel question-answering benchmark tailored to evaluate LLMs' proficiency in formulating a coherent and consistent knowledge representation from segmented narratives. Evaluations across various LLMs reveal significant weaknesses in this domain. We contend that these shortcomings stem from the intrinsic nature of prevailing training objectives. Consequently, we advocate for refining the approach towards knowledge consolidation, as it harbors the potential to dramatically improve their overall effectiveness and performance. The findings from this study offer insights for developing more robust and reliable LLMs. Our code and benchmark are available at https://github.com/chandar-lab/EpiK-Eval

2023-10-07

EMNLP/2023/Conference (accepted)

Towards Few-shot Coordination: Revisiting Ad-hoc Teamplay Challenge In the Game of Hanabi

Hadi Nekoei

Janarthanan Rajendran

Miao Liu

Cooperative Multi-agent Reinforcement Learning (MARL) algorithms with Zero-Shot Coordination (ZSC) have gained significant attention in recent years. ZSC refers to the ability of agents to coordinate zero-shot (without additional interaction experience) with independently trained agents. While ZSC is crucial for cooperative MARL agents, it might not be possible for complex tasks and changing environments. Agents also need to adapt and improve their performance with minimal interaction with other agents. In this work, we show empirically that state-of-the-art ZSC algorithms have poor performance when paired with agents trained with different learning methods, and they require millions of interaction samples to adapt to these new partners. To investigate this issue, we formally defined a framework based on a popular cooperative multi-agent game called Hanabi to evaluate the adaptability of MARL methods. In particular, we created a diverse set of pre-trained agents and defined a new metric called adaptation regret that measures the agent's ability to efficiently adapt and improve its coordination performance when paired with some held-out pool of partners on top of its ZSC performance. After evaluating several SOTA algorithms using our framework, our experiments reveal that naive Independent Q-Learning (IQL) agents in most cases adapt as quickly as the SOTA ZSC algorithm Off-Belief Learning (OBL). This finding raises an interesting research question: how can we design MARL algorithms with high ZSC performance and the capability of fast adaptation to unseen partners? As a first step, we studied the role of different hyper-parameters and design choices on the adaptability of current MARL algorithms. Our experiments show that two categories of hyper-parameters controlling the training data diversity and optimization process have a significant impact on the adaptability of Hanabi agents.

2023-08-20

ArXiv (preprint)

Thompson Sampling for Improved Exploration in GFlowNets

Jarrid Rector-Brooks

Kanika Madan

Moksh J. Jain

Maksym Korablyov

Cheng-Hao Liu

Nikolay Malkin

Yoshua Bengio

Generative flow networks (GFlowNets) are amortized variational inference algorithms that treat sampling from a distribution over compositional objects as a sequential decision-making problem with a learnable action policy. Unlike other algorithms for hierarchical sampling that optimize a variational bound, GFlowNet algorithms can stably run off-policy, which can be advantageous for discovering modes of the target distribution. Despite this flexibility in the choice of behaviour policy, the optimal way of efficiently selecting trajectories for training has not yet been systematically explored. In this paper, we view the choice of trajectories for training as an active learning problem and approach it using Bayesian techniques inspired by methods for multi-armed bandits. The proposed algorithm, Thompson sampling GFlowNets (TS-GFN), maintains an approximate posterior distribution over policies and samples trajectories from this posterior for training. We show in two domains that TS-GFN yields improved exploration and thus faster convergence to the target distribution than the off-policy exploration strategies used in past work.

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)

Should We Attend More or Less? Modulating Attention for Fairness

A. Zayed

Goncalo Mordido

Samira Shabanian

2023-05-22

ArXiv (preprint)

Conditionally optimistic exploration for cooperative deep multi-agent reinforcement learning

Yangchen Pan

Chenjun Xiao

Janarthanan Rajendran

Efficient exploration is critical in cooperative deep Multi-Agent Reinforcement Learning (MARL). In this work, we propose an exploration method that effectively encourages cooperative exploration based on the idea of a sequential action-computation scheme. The high-level intuition is that to perform optimism-based exploration, agents would explore cooperative strategies if each agent’s optimism estimate captures a structured dependency relationship with other agents. Assuming agents compute actions following a sequential order at each environment timestep, we provide a perspective to view MARL as tree search iterations by considering agents as nodes at different depths of the search tree. Inspired by the theoretically justified tree search algorithm UCT (Upper Confidence bounds applied to Trees), we develop a method called Conditionally Optimistic Exploration (COE). COE augments each agent’s state-action value estimate with an action-conditioned optimistic bonus derived from the visitation count of the global state and joint actions of preceding agents. COE is performed during training and disabled at deployment, making it compatible with any value decomposition method for centralized training with decentralized execution. Experiments across various cooperative MARL benchmarks show that COE outperforms current state-of-the-art exploration methods on hard-exploration tasks.

2023-05-08

auai.org/UAI/2023/Conference (published)