Sarath Chandar

Biography

Sarath Chandar is an associate professor in the Department of Computer Engineering and Software Engineering at Polytechnique Montréal, where he leads the Chandar Research Lab. He is also a core academic member at Mila – Quebec Artificial Intelligence Institute, and holds a Canada CIFAR AI Chair and a Canada Research Chair in Lifelong Machine Learning.

His research focuses on lifelong learning, deep learning, optimization, reinforcement learning, and natural language processing. To promote research on lifelong learning, Sarath Chandar founded the Conference on Lifelong Learning Agents (CoLLAs) in 2022 and served as program chair in 2022 and 2023. He holds a PhD from Université de Montréal and a research master's degree from the Indian Institute of Technology Madras.

Current Students

Ista Abbes

Research Master's - UdeM

Davide Baldelli

PhD - Polytechnique

Co-supervisor:

Research Master's - Polytechnique

Naga Karthik Enamundram

PhD - Polytechnique

Principal supervisor:

Julien Cohen-Adad

emvnagakarthik@gmail.com

Prashant Govindarajan

PhD - Polytechnique

Simon Guiroy

PhD - UdeM

Principal supervisor:

Research collaborator - UdeM

Principal supervisor:

PhD - UdeM

David Heurtel-Depeiges

PhD - Polytechnique

Amir Ardalan Kalantari Dehaghi

Jerry Huang

PhD - UdeM

Alumni collaborator

Lola Le Breton

Research Master's - Polytechnique

Postdoctorate - UdeM

PhD - Polytechnique

Roshan Balaji Munirathinam Sankaran Balaji

Mohamed Amine Merzouk

Postdoctorate - Polytechnique

Principal supervisor:

Research intern - Polytechnique

Hadi NekoeiQachkanloo

PhD - UdeM

PhD - UdeM

PhD - UdeM

Postdoctorate

Independent visiting researcher

Mohammad R. Samsami

Research Master's - UdeM

Research Master's - Polytechnique

Arjun Vaithilingam Sudhakar

Megh Thakkar

Research Master's - UdeM

PhD - Polytechnique

Kowen Woo

Research intern - Polytechnique

Abdelrahman Zayed

PhD - Polytechnique

Xutong Zhao

PhD - Polytechnique

Artem Zholus

PhD - Polytechnique

NeoBERT: A New Frontier for Open-Source Encoder Language Models

Blog posts

A digital picture of Bert from Sesame Street, wearing a black trench coat and sunglasses

March 3, 2025

by

Lola Le Breton

Quentin Fournier

Sarath Chandar

October 1, 2024

How do you explain AI and make sure the explanation is true? Faithfulness measurable models show you how

by

Andrea Madsen

Siva Reddy

Sarath Chandar

Publications

Staged independent learning: Towards decentralized cooperative multi-agent Reinforcement Learning

Hadi Nekoei

Akilesh Badrinaaraayanan

Amit Sinha

Mohammad Amini

Janarthanan Rajendran

Aditya Mahajan

We empirically show that classic ideas from two-time scale stochastic approximation (Borkar, 1997) can be combined with sequential iterative best response (SIBR) to solve complex cooperative multi-agent reinforcement learning (MARL) problems. We first give a multi-agent estimation problem as a motivating example where SIBR converges while parallel iterative best response (PIBR) does not. Then we present a general implementation of staged multi-agent RL algorithms based on SIBR and multi-time scale stochastic approximation, and show that our new methods, which we call Staged Independent Proximal Policy Optimization (SIPPO) and Staged Independent Q-learning (SIQL), outperform state-of-the-art independent learning on almost all the tasks in the epymarl (Papoudakis et al., 2020) benchmark. This can be seen as a first step towards more decentralized MARL methods based on SIBR and multi-time scale learning.

2022-04-25

ICLR.cc/2022/Workshop/GMS (published)

openreview.net
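
The difference between sequential and parallel iterative best response mentioned in the abstract above can be illustrated on a toy problem. The sketch below is not the paper's estimation example and involves no learning or multi-time scale updates; it assumes a hypothetical two-agent matching game in which the shared reward is 1 only when both agents pick the same action, so each agent's best response is to copy the other. The function names are illustrative.

```python
def best_response(other_action):
    """In the assumed matching game, the best response is to copy the other agent."""
    return other_action

def parallel_ibr(actions, rounds):
    """Both agents respond simultaneously to the other's previous action."""
    a0, a1 = actions
    for _ in range(rounds):
        a0, a1 = best_response(a1), best_response(a0)
    return a0, a1

def sequential_ibr(actions, rounds):
    """Agents respond one after the other; the second sees the first's new action."""
    a0, a1 = actions
    for _ in range(rounds):
        a0 = best_response(a1)
        a1 = best_response(a0)
    return a0, a1

start = (0, 1)  # agents begin miscoordinated
parallel_result = parallel_ibr(start, rounds=5)
sequential_result = sequential_ibr(start, rounds=5)
```

Starting from a mismatch, the parallel update keeps swapping the two actions and never coordinates, while the sequential update agrees after the first round.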

Improving Sample Efficiency of Value Based Models Using Attention and Vision Transformers

Amir Ardalan Kalantari

Mohammad Saeed Amini

Doina Precup

Much of recent Deep Reinforcement Learning success is owed to the neural architecture's potential to learn and use effective internal representations of the world. While many current algorithms access a simulator to train with a large amount of data, in realistic settings, including while playing games that may be played against people, collecting experience can be quite costly. In this paper, we introduce a deep reinforcement learning architecture whose purpose is to increase sample efficiency without sacrificing performance. We design this architecture by incorporating advances achieved in recent years in the field of Natural Language Processing and Computer Vision. Specifically, we propose a visually attentive model that uses transformers to learn a self-attention mechanism on the feature maps of the state representation, while simultaneously optimizing return. We demonstrate empirically that this architecture improves sample complexity for several Atari environments, while also achieving better performance in some of the games.

2022-02-01

ArXiv (preprint)
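
To make "self-attention on the feature maps" concrete, here is a minimal sketch of scaled dot-product self-attention over the spatial positions of a small feature map. It is a simplification, not the paper's architecture: the query, key and value projections are assumed to be the identity (no learned weights), there is a single head, and the 2x2 feature map is made up for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections (assumption)."""
    d = len(tokens[0])
    outputs, weights = [], []
    for q in tokens:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all position vectors.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, tokens)) for c in range(d)])
    return outputs, weights

# Hypothetical 2x2 feature map with 3 channels, flattened into 4 position tokens.
feature_map = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]],
]
tokens = [cell for row in feature_map for cell in row]
attended, attn = self_attention(tokens)
```

Each row of `attn` is a probability distribution over the four positions, and `attended` keeps the same shape as the flattened input.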

Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers

Gabriele Prato

Simon Guiroy

Ethan Caballero

Irina Rish

Empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly in the light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-E. Accurately predicting the neural network performance with increasing resources such as data, compute and model size provides a more comprehensive evaluation of different approaches across multiple scales, as opposed to traditional point-wise comparisons of fixed-size models on fixed-size benchmarks, and, most importantly, allows for focus on the best-scaling, and thus most promising in the future, approaches. In this work, we consider a challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase is different from the source (training) data distribution, in the sense that it includes new image classes not encountered during training. Our current main goal is to investigate how the amount of pre-training data affects the few-shot generalization performance of standard image classifiers. Our key observations are that (1) such performance improvements are well-approximated by power laws (linear log-log plots) as the training set size increases, (2) this applies to both cases of target data coming from either the same or from a different domain (i.e., new classes) as the training data, and (3) few-shot performance on new classes converges at a faster rate than the standard classification performance on previously seen classes. Our findings shed new light on the relationship between scale and generalization.

2021-10-13

ArXiv (preprint)

openreview.net
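
Observation (1) in the abstract above says that performance follows a power law in the training set size, which appears as a straight line on a log-log plot. A minimal sketch of how such a fit could be done with ordinary least squares in log space follows. The data points are synthetic and purely illustrative, not results from the paper, and `fit_power_law` is a hypothetical helper.

```python
import math

def fit_power_law(sizes, errors):
    """Fit error ~ a * size**(-b) by least squares on (log size, log error)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # slope of the log-log line
    intercept = mean_y - slope * mean_x  # log of the prefactor
    return math.exp(intercept), -slope   # (a, b)

# Synthetic data generated exactly from error = 2.0 * n**(-0.5).
sizes = [1_000, 10_000, 100_000, 1_000_000]
errors = [2.0 * n ** -0.5 for n in sizes]
a, b = fit_power_law(sizes, errors)
```

On real measurements the points would not fall exactly on a line, and the quality of the fit would need to be checked before extrapolating.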

Local Structure Matters Most: Perturbation Study in NLU

Louis Clouâtre

Prasanna Parthasarathi

Amal Zouaq

Recent research analyzing the sensitivity of natural language understanding models to word-order perturbations has shown that neural models are surprisingly insensitive to the order of words. In this paper, we investigate this phenomenon by developing order-altering perturbations on the order of words, subwords, and characters to analyze their effect on neural models’ performance on language understanding tasks. We experiment with measuring the impact of perturbations to the local neighborhood of characters and global position of characters in the perturbed texts and observe that perturbation functions found in prior literature only affect the global ordering while the local ordering remains relatively unperturbed. We empirically show that neural models, invariant of their inductive biases, pretraining scheme, or the choice of tokenization, mostly rely on the local structure of text to build understanding and make limited use of the global structure.

2021-07-29

ArXiv (preprint)
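
The contrast between global and local perturbations in the abstract above can be sketched with two simple functions. These are illustrative stand-ins, not the perturbation functions used in the paper: `shuffle_words` moves whole words around (changing global positions while keeping the characters of each word adjacent), and `shuffle_chars_within_words` scrambles characters inside each word (destroying local character neighborhoods while keeping words in place).

```python
import random

def shuffle_words(text, rng):
    """Global perturbation: reorder whole words, leaving each word intact."""
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)

def shuffle_chars_within_words(text, rng):
    """Local perturbation: scramble characters inside each word, keeping word order."""
    scrambled = []
    for word in text.split():
        chars = list(word)
        rng.shuffle(chars)
        scrambled.append("".join(chars))
    return " ".join(scrambled)

rng = random.Random(0)  # fixed seed so the example is repeatable
sentence = "neural models rely on the local structure of text"
globally_perturbed = shuffle_words(sentence, rng)
locally_perturbed = shuffle_chars_within_words(sentence, rng)
```

Comparing a model's accuracy on the two perturbed versions gives a rough sense of which kind of structure it depends on.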

Continuous Coordination As a Realistic Scenario for Lifelong Learning

Hadi Nekoei

Akilesh Badrinaaraayanan

Aaron Courville

Current deep reinforcement learning (RL) algorithms are still highly task-specific and lack the ability to generalize to new environments. Lifelong learning (LLL), however, aims at solving multiple tasks sequentially by efficiently transferring and using knowledge between tasks. Despite a surge of interest in lifelong RL in recent years, the lack of a realistic testbed makes robust evaluation of LLL algorithms difficult. Multi-agent RL (MARL), on the other hand, can be seen as a natural scenario for lifelong RL due to its inherent non-stationarity, since the agents' policies change over time. In this work, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. Our setup is based on Hanabi -- a partially-observable, fully cooperative multi-agent game that has been shown to be challenging for zero-shot coordination. Its large strategy space makes it a desirable environment for lifelong RL tasks. We evaluate several recent MARL methods, and benchmark state-of-the-art LLL algorithms in limited memory and computation regimes to shed light on their strengths and weaknesses. This continual learning paradigm also provides us with a pragmatic way of going beyond centralized training which is the most commonly used training protocol in MARL. We empirically show that the agents trained in our setup are able to coordinate well with unseen agents, without any additional assumptions made by previous works. The code and all pre-trained models are available at https://github.com/chandar-lab/Lifelong-Hanabi.

2021-07-01

Proceedings of the 38th International Conference on Machine Learning (published)

proceedings.mlr.press

Chaotic Continual Learning

Touraj Laleh

Mojtaba Faramarzi

Irina Rish

Training a deep neural network requires the model to go over training data for several epochs and update network parameters. In continual learning, this process results in catastrophic forgetting, which is one of the core issues of this domain. Most proposed approaches for this issue try to compensate for the effects of parameter updates in the batch incremental setup, in which the training model visits a lot of samples for several epochs. However, it is not realistic to expect that training data will always be fed to the model in a batch incremental setup. This paper proposes a chaotic stream learner that mimics the chaotic behavior of biological neurons and does not update network parameters. In addition, it can work with fewer samples compared to deep learning models in a stream learning setup. Our experiments on MNIST, CIFAR10, and Omniglot show that the chaotic stream learner has less catastrophic forgetting by its nature in comparison to a CNN model in continual learning.

2020-07-13

ICML.cc/2020/Workshop/LifelongML (unknown)

openreview.net

Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning

Sai Krishna Gottipati

B. Sattarov

Sufeng Niu

Yashaswi Pathak

Haoran Wei

Shengchao Liu

Karam M. J. Thomas

Simon R. Blackburn

Connor Wilson Coley

Jian Tang

Over the last decade, there has been significant progress in the field of machine learning for de novo drug design, particularly in deep generative models. However, current generative approaches exhibit a significant challenge as they do not ensure that the proposed molecular structures can be feasibly synthesized nor do they provide the synthesis routes of the proposed small molecules, thereby seriously limiting their practical applicability. In this work, we propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design, Policy Gradient for Forward Synthesis (PGFS), that addresses this challenge by embedding the concept of synthetic accessibility directly into the de novo drug design system. In this setup, the agent learns to navigate through the immense synthetically accessible chemical space by subjecting commercially available small molecule building blocks to valid chemical reactions at every time step of the iterative virtual multi-step synthesis process. The proposed environment for drug discovery provides a highly challenging test-bed for RL algorithms owing to the large state space and high-dimensional continuous action space with hierarchical actions. PGFS achieves state-of-the-art performance in generating structures with high QED and penalized clogP. Moreover, we validate PGFS in an in-silico proof-of-concept associated with three HIV targets. Finally, we describe how the end-to-end training conceptualized in this study represents an important paradigm in radically expanding the synthesizable chemical space and automating the drug discovery process.

2020-01-01

ICML (published)

proceedings.mlr.press

Toward Training Recurrent Neural Networks for Lifelong Learning

Shagun Sodhani

2020-01-01

Neural Computation (published)

Supplementary Material - Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning

Sai Krishna Gottipati

B. Sattarov

Sufeng Niu

Yashaswi Pathak

Haoran Wei

Shengchao Liu

Karam M. J. Thomas

Simon R. Blackburn

Connor Wilson Coley

Jian Tang

While updating the critic network, we multiply the normal random noise vector with policy noise of 0.2 and then clip it in the range -0.2 to 0.2. This clipped policy noise is added to the action at the next time step a′ computed by the target actor networks f and π. The actor networks (f and π networks), target critic and target actor networks are updated once every two updates to the critic network.
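
The noise and update schedule described above can be sketched in a few lines. The constants 0.2, the clipping range and the "once every two critic updates" schedule come from the text; everything else (the function names, the example action vector, the omitted network updates) is a hypothetical placeholder. The scheme resembles the target policy smoothing and delayed updates used in TD3.

```python
import random

POLICY_NOISE = 0.2  # scale applied to standard normal noise (from the text)
NOISE_CLIP = 0.2    # noise is clipped to [-0.2, 0.2] (from the text)
POLICY_DELAY = 2    # delayed networks update every 2 critic updates (from the text)

def smoothed_target_action(next_action, rng):
    """Add clipped Gaussian noise to the target actors' next action a'."""
    noisy = []
    for component in next_action:
        noise = rng.gauss(0.0, 1.0) * POLICY_NOISE
        noise = max(-NOISE_CLIP, min(NOISE_CLIP, noise))
        noisy.append(component + noise)
    return noisy

def count_delayed_updates(num_critic_updates):
    """Count how often the actor and target networks would be updated."""
    delayed = 0
    for step in range(1, num_critic_updates + 1):
        # The critic update would happen here on every step.
        if step % POLICY_DELAY == 0:
            # Actor, target critic and target actor updates would happen here.
            delayed += 1
    return delayed

rng = random.Random(0)
next_action = [0.5, -0.3, 0.1]  # hypothetical a' from the target actor networks
target_action = smoothed_target_action(next_action, rng)
delayed_updates = count_delayed_updates(10)
```

Each component of `target_action` stays within 0.2 of the original action, and ten critic updates trigger five delayed updates.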

Structure Learning for Neural Module Networks

Vardaan Pahuja

Jie Fu

Chris Pal

Neural Module Networks, originally proposed for the task of visual question answering, are a class of neural network architectures that involve human-specified neural modules, each designed for a specific form of reasoning. In current formulations of such networks only the parameters of the neural modules and/or the order of their execution is learned. In this work, we further expand this approach and also learn the underlying internal structure of modules in terms of the ordering and combination of simple and elementary arithmetic operators. We utilize a minimum amount of prior knowledge from the human-specified neural modules in the form of different input types and arithmetic operators used in these modules. Our results show that one is indeed able to simultaneously learn both internal module structure and module sequencing without extra supervisory signals for module execution sequencing. With this approach, we report performance comparable to models using hand-designed modules. In addition, we analyze the sensitivity of the learned modules w.r.t. the arithmetic operations and infer the analytical expressions of the learned modules.

2019-11-01

Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN) (published)

Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies

Chinnadhurai Sankar

Eugene Vorontsov

Samira Ebrahimi Kahou

Modelling long-term dependencies is a challenge for recurrent neural networks. This is primarily due to the fact that gradients vanish during training, as the sequence length increases. Gradients can be attenuated by transition operators and are attenuated or dropped by activation functions. Canonical architectures like LSTM alleviate this issue by skipping information through a memory mechanism. We propose a new recurrent architecture (Non-saturating Recurrent Unit; NRU) that relies on a memory mechanism but forgoes both saturating activation functions and saturating gates, in order to further alleviate vanishing gradients. In a series of synthetic and real world tasks, we demonstrate that the proposed model is the only model that performs among the top 2 models across all tasks with and without long-term dependencies, when compared against a range of other architectures.

2019-07-17

Proceedings of the AAAI Conference on Artificial Intelligence (published)
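
The effect of saturating activations on gradients, which motivates the architecture in the abstract above, can be shown with a small calculation. The sketch below multiplies the activation derivative across time steps while ignoring weight matrices and gates, which is a strong simplification. ReLU stands in here for a non-saturating activation as an illustrative assumption, and the pre-activation value 3.0 and the 20 steps are arbitrary.

```python
import math

def tanh_grad(x):
    """Derivative of tanh, which approaches 0 as |x| grows (saturation)."""
    return 1.0 - math.tanh(x) ** 2

def relu_grad(x):
    """Derivative of ReLU, which stays at 1 for any positive input."""
    return 1.0 if x > 0 else 0.0

def gradient_through_time(grad_fn, pre_activation, steps):
    """Product of per-step activation derivatives, ignoring weight matrices."""
    factor = 1.0
    for _ in range(steps):
        factor *= grad_fn(pre_activation)
    return factor

tanh_factor = gradient_through_time(tanh_grad, 3.0, 20)
relu_factor = gradient_through_time(relu_grad, 3.0, 20)
```

With a saturated tanh the accumulated factor collapses toward zero after a few steps, while the non-saturating activation passes the gradient through unchanged.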

Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study

Chinnadhurai Sankar

Sandeep Subramanian

Chris Pal

Neural generative models have become increasingly popular when building conversational agents. They offer flexibility, can be easily adapted to new domains, and require minimal domain engineering. A common criticism of these systems is that they seldom understand or use the available dialog history effectively. In this paper, we take an empirical approach to understanding how these models use the available dialog history by studying the sensitivity of the models to artificially introduced unnatural changes or perturbations to their context at test time. We experiment with 10 different types of perturbations on 4 multi-turn dialog datasets and find that commonly used neural dialog architectures like recurrent and transformer-based seq2seq models are rarely sensitive to most perturbations such as missing or reordering utterances, shuffling words, etc. Also, by open-sourcing our code, we believe that it will serve as a useful diagnostic tool for evaluating dialog systems in the future.

2019-07-01

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (published)
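
The utterance-level perturbations named in the abstract above (missing or reordered utterances) can be sketched as simple list operations on a dialog history. These functions are illustrative and are not the authors' released code; the four-turn dialog is made up, and only three of the kinds of perturbation mentioned are shown.

```python
import random

def shuffle_utterances(history, rng):
    """Reorder the utterances randomly, leaving each utterance intact."""
    shuffled = history[:]  # copy so the original history is not modified
    rng.shuffle(shuffled)
    return shuffled

def reverse_utterances(history):
    """Reverse the order of the utterances."""
    return history[::-1]

def drop_last_utterance(history):
    """Remove the most recent utterance from the context."""
    return history[:-1]

rng = random.Random(0)
history = [
    "Hi, I need a table for two.",
    "Sure, what time?",
    "Seven this evening.",
    "Done, see you at seven.",
]
perturbed = {
    "shuffled": shuffle_utterances(history, rng),
    "reversed": reverse_utterances(history),
    "dropped": drop_last_utterance(history),
}
```

Feeding each perturbed history to a dialog model at test time and comparing the outputs with those for the original history is one way to probe how much the model uses its context.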