Sarath Chandar

Biographie

Sarath Chandar est professeur associé au départment de génie informatique et génie logiciel de Polytechnique Montréal, où il dirige le laboratoire de recherche Chandar. Il est également membre académique principal à Mila – Institut québécois d’intelligence artificielle, et titulaire d'une chaire en IA Canada-CIFAR et d'une Chaire de recherche du Canada en apprentissage machine permanent.

Ses recherches portent sur l'apprentissage tout au long de la vie, l'apprentissage profond, l'optimisation, l'apprentissage par renforcement et le traitement du langage naturel. Pour promouvoir la recherche sur l'apprentissage tout au long de la vie, Sarath Chandar a créé la Conférence sur les agents d'apprentissage tout au long de la vie (CoLLAs) en 2022 et a présidé le programme en 2022 et en 2023. Il est titulaire d'un doctorat de l'Université de Montréal et d'une maîtrise en recherche de l'Indian Institute of Technology Madras.

Étudiants actuels

Ista Abbes

Maîtrise recherche - UdeM

Alex Aselstyne

Maîtrise recherche - Polytechnique

Davide Baldelli

Doctorat - Polytechnique

Co-superviseur⋅e :

Milan Bhan

Collaborateur·rice de recherche

Diego Cerda Mardini

Maîtrise recherche - McGill

Antoine Clavaud

Maîtrise recherche - Polytechnique

Naga Karthik Enamundram

Doctorat - Polytechnique

Superviseur⋅e principal⋅e :

Prashant Govindarajan

Doctorat - Polytechnique

Simon Guiroy

Doctorat - UdeM

Superviseur⋅e principal⋅e :

Doctorat - UdeM

David Heurtel--Depeiges

Doctorat - Polytechnique

Jerry Huang

Doctorat - UdeM

Saurav Jha

Postdoctorat - Polytechnique

Amir Kalantari Dehaghi

Collaborateur·rice alumni

Lola Le Breton

Doctorat - Polytechnique

Aidan Li

Maîtrise recherche - UdeM

Co-superviseur⋅e :

Postdoctorat - UdeM

Doctorat - Polytechnique

Roshan Munirathinam Sankaran Balaji

Stagiaire de recherche - Polytechnique

Rayen Nacef

Collaborateur·rice de recherche - Polytechnique

Hadi NekoeiQachkanloo

Doctorat - UdeM

Nilaksh Nilaksh

Doctorat - Polytechnique

Doctorat - UdeM

Collaborateur·rice de recherche - Polytechnique Montreal

Doctorat - UdeM

Postdoctorat

Shaipranesh Senthilkumar

Doctorat - Polytechnique

Arjun Vaithilingam Sudhakar

Nour Shaheen

Maîtrise recherche - Polytechnique

Superviseur⋅e principal⋅e :

Doctorat - Polytechnique

Megh Thakkar

Maîtrise recherche - UdeM

Doctorat - Polytechnique

Shawn Whitfield

Collaborateur·rice de recherche

Kowen Woo

Stagiaire de recherche - Polytechnique

Anabel XL

Postdoctorat - UdeM

Abdelrahman Zayed

Doctorat - Polytechnique

Xutong Zhao

Doctorat - Polytechnique

Artem Zholus

Doctorat - Polytechnique

NeoBERT: une nouvelle frontière pour les modèles de langage encodeurs open-source

Billets de blogue

A digital picture of Bert from Sesame street, wering black trench coat and sunglasses

3 mars 2025

par

Lola Le Breton

Quentin Fournier

Sarath Chandar

Lire l'article

1 octobre 2024

Comment expliquer l’IA et s’assurer que cette explication est vraie? Les modèles mesurables de fidélité vous indiquent comment y parvenir

par

Andrea Madsen

Siva Reddy

Sarath Chandar

Lire l'article

Publications

Learning Conditional Policies for Crystal Design Using Offline Reinforcement Learning

Prashant Govindarajan

Santiago Miret

Jarrid Rector-Brooks

Mariano Phielipp

Navigating through the exponentially large chemical space to search for desirable materials is an extremely challenging task in material dis… (voir plus)covery. Recent developments in generative and geometric deep learning have shown...

2024-01-01

Digital Discovery (publié)

MVP: Minimal Viable Phrase for Long Text Understanding.

Louis Clouatre

Amal Zouaq

2024-01-01

International Conference on Language Resources and Evaluation (publié)

dblp.uni-trier.de

Sharpness-Aware Minimization Scaled by Outlier Normalization for Robust DNNs on In-Memory Computing Accelerators

Sébastien Henwood

Goncalo Mordido

Yvon Savaria

François Leduc-Primeau

Many deep neural network (DNN) models consume a significant amount of energy at inference time, in large part due to energy consumed by memo… (voir plus)ry access. In-memory computing addresses this problem by eliminating many memory accesses, but exposes model weights to noise and circuit variations. While several methods have been proposed to train DNNs robust to weight noise they typically require knowledge of the noise distribution, or degrade the DNN performance in noiseless setting. In this work, we first show that applying sharpness-aware training, by optimizing for both the loss value and loss sharpness, significantly improves robustness to noisy weights at inference time. Then, we propose a new adaptive sharpness-aware method that conditions the worst-case perturbation of a given weight not only on its magnitude but also on the range of the weight distribution. This is achieved by performing sharpness-aware minimization scaled by outlier normalization (SAMSON). Results on computer-vision benchmarks show that SAMSON increases model robustness to noisy weights without compromising generalization performance in noiseless regimes.

2024-01-01

IEEECONF (publié)

Fairness-Aware Structured Pruning in Transformers

Abdelrahman Zayed

Goncalo Mordido

Samira Shabanian

Ioana Baldini

2023-12-24

ArXiv (prépublication)

Fairness-Aware Structured Pruning in Transformers

Abdelrahman Zayed

Goncalo Mordido

Samira Shabanian

Ioana Baldini

The increasing size of large language models (LLMs) has introduced challenges in their training and inference. Removing model components is … (voir plus)perceived as a solution to tackle the large model sizes, however, existing pruning methods solely focus on performance, without considering an essential aspect for the responsible use of LLMs: model fairness. It is crucial to address the fairness of LLMs towards diverse groups, such as women, Black people, LGBTQ+, Jewish communities, among others, as they are being deployed and available to a wide audience. In this work, first, we investigate how attention heads impact fairness and performance in pre-trained transformer-based language models. We then propose a novel method to prune the attention heads that negatively impact fairness while retaining the heads critical for performance, i.e. language modeling capabilities. Our approach is practical in terms of time and resources, as it does not require fine-tuning the final pruned, and fairer, model. Our findings demonstrate a reduction in gender bias by 19%, 19.5%, 39.5%, 34.7%, 23%, and 8% for DistilGPT-2, GPT-2, GPT-Neo of two different sizes, GPT-J, and Llama 2 models, respectively, in comparison to the biased model, with only a slight decrease in performance. WARNING: This work uses language that is offensive in nature.

2023-12-24

ArXiv (prépublication)

Measuring the Knowledge Acquisition-Utilization Gap in Pretrained Language Models

Amirhossein Kazemnejad

Mehdi Rezagholizadeh

Prasanna Parthasarathi

2023-12-01

Findings of the Association for Computational Linguistics: EMNLP 2023 (publié)

openreview.net

Dealing With Non-stationarity in Decentralized Cooperative Multi-Agent Deep Reinforcement Learning via Multi-Timescale Learning

Hadi Nekoei

Akilesh Badrinaaraayanan

Amit Sinha

Mohammad Amin Amini

Aditya Mahajan

2023-11-20

Proceedings of The 2nd Conference on Lifelong Learning Agents (publié)

Towards Few-shot Coordination: Revisiting Ad-hoc Teamplay Challenge In the Game of Hanabi

Hadi Nekoei

Xutong Zhao

Miao Liu

2023-11-20

Proceedings of The 2nd Conference on Lifelong Learning Agents (publié)

Arjun Vaithilingam Sudhakar

Language Model-In-The-Loop: Data Optimal Approach to Learn-To-Recommend Actions in Text Games

Prasanna Parthasarathi

2023-11-13

ArXiv (prépublication)

EpiK-Eval: Evaluation for Language Models as Epistemic Models

Gabriele Prato

Jerry Huang

Prasanna Parthasarathi

Shagun Sodhani

In the age of artificial intelligence, the role of large language models (LLMs) is becoming increasingly central. Despite their growing prev… (voir plus)alence, their capacity to consolidate knowledge from different training documents—a crucial ability in numerous applications—remains unexplored. This paper presents the first study examining the capability of LLMs to effectively combine such information within their parameter space. We introduce EpiK-Eval, a novel question-answering benchmark tailored to evaluate LLMs' proficiency in formulating a coherent and consistent knowledge representation from segmented narratives. Evaluations across various LLMs reveal significant weaknesses in this domain. We contend that these shortcomings stem from the intrinsic nature of prevailing training objectives. Consequently, we advocate for refining the approach towards knowledge consolidation, as it harbors the potential to dramatically improve their overall effectiveness and performance. The findings from this study offer insights for developing more robust and reliable LLMs. Our code and benchmark are available at https://github.com/chandar-lab/EpiK-Eval

2023-10-07

EMNLP/2023/Conference (accepté)

openreview.net

Towards Few-shot Coordination: Revisiting Ad-hoc Teamplay Challenge In the Game of Hanabi

Hadi Nekoei

Xutong Zhao

Miao Liu

Cooperative Multi-agent Reinforcement Learning (MARL) algorithms with Zero-Shot Coordination (ZSC) have gained significant attention in rece… (voir plus)nt years. ZSC refers to the ability of agents to coordinate zero-shot (without additional interaction experience) with independently trained agents. While ZSC is crucial for cooperative MARL agents, it might not be possible for complex tasks and changing environments. Agents also need to adapt and improve their performance with minimal interaction with other agents. In this work, we show empirically that state-of-the-art ZSC algorithms have poor performance when paired with agents trained with different learning methods, and they require millions of interaction samples to adapt to these new partners. To investigate this issue, we formally defined a framework based on a popular cooperative multi-agent game called Hanabi to evaluate the adaptability of MARL methods. In particular, we created a diverse set of pre-trained agents and defined a new metric called adaptation regret that measures the agent's ability to efficiently adapt and improve its coordination performance when paired with some held-out pool of partners on top of its ZSC performance. After evaluating several SOTA algorithms using our framework, our experiments reveal that naive Independent Q-Learning (IQL) agents in most cases adapt as quickly as the SOTA ZSC algorithm Off-Belief Learning (OBL). This finding raises an interesting research question: How to design MARL algorithms with high ZSC performance and capability of fast adaptation to unseen partners. As a first step, we studied the role of different hyper-parameters and design choices on the adaptability of current MARL algorithms. Our experiments show that two categories of hyper-parameters controlling the training data diversity and optimization process have a significant impact on the adaptability of Hanabi agents.

2023-08-20

ArXiv (prépublication)

Thompson Sampling for Improved Exploration in GFlowNets

Moksh J. Jain

Cheng-Hao Liu

Generative flow networks (GFlowNets) are amortized variational inference algorithms that treat sampling from a distribution over composition… (voir plus)al objects as a sequential decision-making problem with a learnable action policy. Unlike other algorithms for hierarchical sampling that optimize a variational bound, GFlowNet algorithms can stably run off-policy, which can be advantageous for discovering modes of the target distribution. Despite this flexibility in the choice of behaviour policy, the optimal way of efficiently selecting trajectories for training has not yet been systematically explored. In this paper, we view the choice of trajectories for training as an active learning problem and approach it using Bayesian techniques inspired by methods for multi-armed bandits. The proposed algorithm, Thompson sampling GFlowNets (TS-GFN), maintains an approximate posterior distribution over policies and samples trajectories from this posterior for training. We show in two domains that TS-GFN yields improved exploration and thus faster convergence to the target distribution than the off-policy exploration strategies used in past work.

2023-06-19

ICML.cc/2023/Workshop/SPIGM (poster)