
Sarath Chandar

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Research Topics
AI Alignment
Deep Learning
Explainable AI (XAI)
Foundation Models
Interpretability
Large Language Models (LLM)
Lifelong Learning
Medical Machine Learning
Multi-Agent Systems
Natural Language Processing
Online Learning
Optimization
Recurrent Neural Networks
Reinforcement Learning
Representation Learning
Transfer Learning
Trustworthy AI

Biography

Sarath Chandar is an associate professor in Polytechnique Montréal's Department of Computer Engineering and Software Engineering, where he leads the Chandar Research Lab. He is also a Core Academic Member at Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair and the Canada Research Chair in Lifelong Machine Learning.

Chandar's research interests include lifelong learning, deep learning, optimization, reinforcement learning, and natural language processing. To promote research in lifelong learning, he founded the Conference on Lifelong Learning Agents (CoLLAs) in 2022 and served as its program chair in 2022 and 2023.

He holds a PhD from Université de Montréal and an MSc (by research) from the Indian Institute of Technology Madras.

Current Students

Supervises Master's research students, PhD students, and postdoctoral researchers at Polytechnique Montréal and Université de Montréal, as principal supervisor or co-supervisor, along with visiting and collaborating researchers.

Publications

Context-Aware Assistant Selection for Improved Inference Acceleration with Large Language Models
Jerry Huang
Prasanna Parthasarathi
Mehdi Rezagholizadeh
Exploring Quantization for Efficient Pre-Training of Transformer Language Models
Kamran Chitsaz
Quentin Fournier
Goncalo Mordido
The increasing scale of Transformer models has led to an increase in their pre-training computational requirements. While quantization has proven to be effective after pre-training and during fine-tuning, applying quantization in Transformers during pre-training has remained largely unexplored at scale for language modeling. This study aims to explore the impact of quantization for efficient pre-training of Transformers, with a focus on linear layer components. By systematically applying straightforward linear quantization to weights, activations, gradients, and optimizer states, we assess its effects on model efficiency, stability, and performance during training. By offering a comprehensive recipe of effective quantization strategies to be applied during the pre-training of Transformers, we promote high training efficiency from scratch while retaining language modeling ability. Code is available at https://github.com/chandar-lab/EfficientLLMs.
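As a rough illustration of the "straightforward linear quantization" the abstract describes, the sketch below implements symmetric per-tensor linear quantization in PyTorch and applies it in a quantize-dequantize ("fake quantization") round trip. The function names, bit width, and usage are my own simplification for illustration, not the paper's actual recipe (see the linked repository for that).

```python
import torch

def linear_quantize(x: torch.Tensor, bits: int = 8):
    """Symmetric per-tensor linear quantization: map x to signed integers.

    A minimal sketch of simple linear quantization; the paper's exact
    recipe (per-channel scaling, which tensors to quantize, etc.) may differ.
    """
    qmax = 2 ** (bits - 1) - 1                      # e.g. 127 for 8 bits
    scale = x.abs().max().clamp(min=1e-8) / qmax    # avoid division by zero
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q * scale

# Quantize-dequantize a weight tensor, as one might do to weights,
# activations, gradients, or optimizer states during pre-training to
# study the effect on stability and performance.
w = torch.randn(4096, 4096)
q, s = linear_quantize(w, bits=8)
w_hat = dequantize(q, s)
print((w - w_hat).abs().max())  # worst-case rounding error
```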
Do Large Language Models Know How Much They Know?
Gabriele Prato
Jerry Huang
Prasanna Parthasarathi
Shagun Sodhani
Large Language Models (LLMs) have emerged as highly capable systems and are increasingly being integrated into various uses. Nevertheless, the rapid advancement in their deployment trails a comprehensive understanding of their internal mechanisms, as well as a delineation of their capabilities and limitations. A desired characteristic of an intelligent system is its ability to recognize the scope of its own knowledge. To investigate whether LLMs embody this attribute, we develop a benchmark that challenges these models to enumerate all information they possess on specific topics. This benchmark assesses whether the models recall excessive, insufficient, or the precise amount of required information, thereby indicating their awareness of how much they know about the given topic. Our findings reveal that the emergence of this property varies across different architectures and manifests at diverse rates. However, with sufficient scaling, all tested models are ultimately capable of performing this task. The insights gained from this research advance our understanding of LLMs, shedding light on their operational capabilities and contributing to the ongoing exploration of their intricate dynamics.
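To make the scoring idea in the abstract concrete, here is a hypothetical sketch that classifies a model's enumerated recall as excessive, insufficient, or precise against a reference set. The function name, classification order, and toy data are illustrative assumptions; the paper's actual benchmark construction and metrics may differ.

```python
def classify_recall(enumerated: set[str], reference: set[str]) -> str:
    """Classify a model's enumeration against the reference information.

    Sketch of the idea described in the abstract: the model should list
    exactly the information it holds on a topic, no more and no less.
    If recall is both excessive and incomplete, "excessive" wins here;
    a real benchmark would likely report both dimensions separately.
    """
    extra = enumerated - reference
    missing = reference - enumerated
    if extra:
        return "excessive"     # recalled items beyond the reference
    if missing:
        return "insufficient"  # failed to recall some reference items
    return "precise"           # recalled exactly the required information

# Usage with toy data (illustrative only):
reference = {"fact A", "fact B", "fact C"}
print(classify_recall({"fact A", "fact B", "fact C"}, reference))  # precise
print(classify_recall({"fact A", "fact B"}, reference))            # insufficient
```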
Learning Conditional Policies for Crystal Design Using Offline Reinforcement Learning
Prashant Govindarajan
Santiago Miret
Jarrid Rector-Brooks
Mariano Phielipp
Janarthanan Rajendran
Navigating through the exponentially large chemical space to search for desirable materials is an extremely challenging task in material discovery. Recent developments in generative and geometric deep learning have shown...
MVP: Minimal Viable Phrase for Long Text Understanding
Louis Clouâtre
Fairness-Aware Structured Pruning in Transformers
Abdelrahman Zayed
Goncalo Mordido
Samira Shabanian
Ioana Baldini
The increasing size of large language models (LLMs) has introduced challenges in their training and inference. Removing model components is perceived as a solution to tackle the large model sizes; however, existing pruning methods solely focus on performance, without considering an essential aspect for the responsible use of LLMs: model fairness. It is crucial to address the fairness of LLMs towards diverse groups, such as women, Black people, LGBTQ+, Jewish communities, among others, as they are being deployed and available to a wide audience. In this work, first, we investigate how attention heads impact fairness and performance in pre-trained transformer-based language models. We then propose a novel method to prune the attention heads that negatively impact fairness while retaining the heads critical for performance, i.e. language modeling capabilities. Our approach is practical in terms of time and resources, as it does not require fine-tuning the final pruned, and fairer, model. Our findings demonstrate a reduction in gender bias by 19%, 19.5%, 39.5%, 34.7%, 23%, and 8% for DistilGPT-2, GPT-2, GPT-Neo of two different sizes, GPT-J, and Llama 2 models, respectively, in comparison to the biased model, with only a slight decrease in performance. WARNING: This work uses language that is offensive in nature.
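To convey the head-selection idea in the abstract, the sketch below picks attention heads to prune by trading off a measured bias impact against a performance budget. The function, tensors, and threshold are hypothetical assumptions; the paper's actual head-importance estimation and pruning procedure are not reproduced here.

```python
import torch

def select_heads_to_prune(bias_impact: torch.Tensor,
                          perf_impact: torch.Tensor,
                          perf_budget: float) -> torch.Tensor:
    """Pick attention heads whose removal reduces bias at low performance cost.

    Hypothetical sketch: bias_impact[h] is the change in a bias metric when
    head h is ablated (negative = fairer), and perf_impact[h] is the increase
    in language-modeling loss. Heads that make the model fairer while staying
    within the loss budget are marked for pruning.
    """
    candidates = (bias_impact < 0) & (perf_impact <= perf_budget)
    return candidates  # boolean mask over heads: True = prune

# Toy usage: 12 heads, ablation effects measured elsewhere (illustrative).
bias_impact = torch.randn(12)          # change in bias metric per head
perf_impact = torch.rand(12) * 0.1     # loss increase per head
mask = select_heads_to_prune(bias_impact, perf_impact, perf_budget=0.05)
print(mask.nonzero().flatten().tolist())  # indices of heads to prune
```

No fine-tuning step appears in this sketch, matching the abstract's claim that the pruned, fairer model needs no further training.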
Measuring the Knowledge Acquisition-Utilization Gap in Pretrained Language Models
Amirhossein Kazemnejad
Mehdi Rezagholizadeh
Prasanna Parthasarathi
Dealing With Non-stationarity in Decentralized Cooperative Multi-Agent Deep Reinforcement Learning via Multi-Timescale Learning
Hadi Nekoei
Akilesh Badrinaaraayanan
Amit Sinha
Mohammad Amin Amini
Janarthanan Rajendran
Towards Few-shot Coordination: Revisiting Ad-hoc Teamplay Challenge In the Game of Hanabi
Hadi Nekoei
Xutong Zhao
Janarthanan Rajendran
Miao Liu