Sarath Chandar Anbil Parthipan

ali-rahimi.kalahroudi@mila.quebec

Biography

Sarath Chandar is an assistant professor at Polytechnique Montréal, where he leads the Chandar Research Lab. He is also a core academic member of Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair and the Canada Research Chair in Lifelong Machine Learning.

Chandar’s research interests include lifelong learning, deep learning, optimization, reinforcement learning and natural language processing. To promote research in lifelong learning, Chandar created the Conference on Lifelong Learning Agents (CoLLAs) in 2022, for which he served as program chair in 2022 and 2023.

He has a PhD from Université de Montréal and an MSc (by research) from the Indian Institute of Technology Madras.

Current Students

Abdelrahman Zayed

PhD - Polytechnique Montréal

zayedabd@mila.quebec

Ali Rahimi-Kalahroudi

Master's Research - Université de Montréal

Amir Ardalan Kalantari Dehaghi

Collaborating Alumni

PhD - Polytechnique Montréal

Co-supervisor:

Siva Reddy

andreas.madsen@mila.quebec

Master's Research - Polytechnique Montréal

antoine.clavaud@mila.quebec

Arjun Vaithilingam Sudhakar

PhD - Polytechnique Montréal

arjun.vaithilingam-sudhakar@mila.quebec

Artem Zholus

PhD - Polytechnique Montréal

artem.zholus@mila.quebec

PhD - Université de Montréal

darshan.patil@mila.quebec

francois.rivest@mila.quebec

Francois Rivest

Independent visiting researcher

goncalo-filipe.torcato-mordido@mila.quebec

Gabriele Prato

PhD - Université de Montréal

Postdoctorate - Polytechnique Montréal

Hadi NekoeiQachkanloo

PhD - Université de Montréal

nekoeihe@mila.quebec

Ista Abbes

Master's Research - Université de Montréal

istabrak.abbes@mila.quebec

Janarthanan Rajendran

Postdoctorate - Université de Montréal

Co-supervisor:

Doina Precup

janarthanan.rajendran@mila.quebec

jarrid.rector-brooks@mila.quebec

Jarrid Rector-Brooks

PhD - Université de Montréal

Principal supervisor:

Yoshua Bengio

Jerry Huang

Master's Research - Université de Montréal

jerry.huang@mila.quebec

Kshitij Gupta

Collaborating Alumni - Université de Montréal

Principal supervisor:

Irina Rish

kshitij.gupta@mila.quebec

Lola Le Breton

Master's Research - Polytechnique Montréal

lola.lebreton@mila.quebec

Louis Clouatre

PhD - Polytechnique Montréal

Principal supervisor:

maryam.hashemzadeh@mila.quebec

clouatrl@mila.quebec

Maryam Hashemzadeh

PhD

Mathieu Duchesneau

PhD - Université de Montréal

Postdoctorate

mathieu.reymond@mila.quebec

Maziar Sargordi

PhD - Polytechnique Montréal

Co-supervisor:

maziar.sargordi@mila.quebec

mohammad-reza.samsami@mila.quebec

Megh Thakkar

Master's Research - Université de Montréal

megh.thakkar@mila.quebec

Mohammad R. Samsami

Master's Research - Université de Montréal

naga-karthik.enamundram@mila.quebec

Naga Karthik Enamundram

PhD - Polytechnique Montréal

Principal supervisor:

Julien Cohen-Adad

PhD - Polytechnique Montréal

pranshu.malviya@mila.quebec

prashant.govindarajan@mila.quebec

Prashant Govindarajan

PhD - Polytechnique Montréal

Simon Guiroy

PhD - Université de Montréal

Principal supervisor:

Xutong Zhao

PhD - Polytechnique Montréal

xutong.zhao@mila.quebec

Publications

Replay Buffer with Local Forgetting for Adapting to Local Environment Changes in Deep Model-Based Reinforcement Learning

Ali Rahimi-Kalahroudi

Janarthanan Rajendran

Ida Momennejad

Harm van Seijen

2023-01-01

CoLLAs (published)

proceedings.mlr.press

Self-Influence Guided Data Reweighting for Language Model Pre-training

Megh Thakkar

Tolga Bolukbasi

Sriram Ganapathy

Shikhar Vashishth

Partha Talukdar

Language Models (LMs) pre-trained with self-supervision on large text corpora have become the default starting point for developing models for various NLP tasks. Once the pre-training corpus has been assembled, all data samples in the corpus are treated with equal importance during LM pre-training. However, due to varying levels of relevance and quality of data, equal importance to all the data samples may not be the optimal choice. While data reweighting has been explored in the context of task-specific supervised learning and LM fine-tuning, model-driven reweighting for pre-training data has not been explored. We fill this important gap and propose PRESENCE, a method for jointly reweighting samples by leveraging self-influence (SI) scores as an indicator of sample importance and pre-training. PRESENCE promotes novelty and stability for model pre-training. Through extensive analysis spanning multiple model sizes, datasets, and tasks, we present PRESENCE as an important first step in the research direction of sample reweighting for pre-training language models.

2023-01-01

EMNLP (published)

Post-hoc Interpretability for Neural NLP: A Survey

Andreas Madsen

Siva Reddy

2022-12-23

ACM Computing Surveys (published)

Replay Buffer With Local Forgetting for Adaptive Deep Model-Based Reinforcement Learning

Ali Rahimi-Kalahroudi

Janarthanan Rajendran

Ida Momennejad

Harm van Seijen

One of the key behavioral characteristics used in neuroscience to determine whether the subject of study, be it a rodent or a human, exhibits model-based learning is effective adaptation to local changes in the environment. In reinforcement learning, however, recent work has shown that modern deep model-based reinforcement-learning (MBRL) methods adapt poorly to such changes. An explanation for this mismatch is that MBRL methods are typically designed with sample-efficiency on a single task in mind and the requirements for effective adaptation are substantially higher, both in terms of the learned world model and the planning routine. One particularly challenging requirement is that the learned world model has to be sufficiently accurate throughout relevant parts of the state-space. This is challenging for deep-learning-based world models due to catastrophic forgetting. And while a replay buffer can mitigate the effects of catastrophic forgetting, the traditional first-in-first-out replay buffer precludes effective adaptation due to maintaining stale data. In this work

2022-12-09

NeurIPS.cc/2022/Workshop/DeepRL (unknown)

PatchBlender: A Motion Prior for Video Transformers

Gabriele Prato

Yale Song

Janarthanan Rajendran

(Rex) Devon Hjelm

Neel Joshi

2022-11-11

ArXiv (preprint)

Local Structure Matters Most: Perturbation Study in NLU

Louis Clouâtre

Prasanna Parthasarathi

Recent research analyzing the sensitivity of natural language understanding models to word-order perturbations has shown that neural models are surprisingly insensitive to the order of words. In this paper, we investigate this phenomenon by developing order-altering perturbations on the order of words, subwords, and characters to analyze their effect on neural models’ performance on language understanding tasks. We experiment with measuring the impact of perturbations to the local neighborhood of characters and global position of characters in the perturbed texts and observe that perturbation functions found in prior literature only affect the global ordering while the local ordering remains relatively unperturbed. We empirically show that neural models, invariant of their inductive biases, pretraining scheme, or the choice of tokenization, mostly rely on the local structure of text to build understanding and make limited use of the global structure.

2022-05-01

Findings of the Association for Computational Linguistics: ACL 2022 (published)

Staged independent learning: Towards decentralized cooperative multi-agent Reinforcement Learning

Hadi Nekoei

Akilesh Badrinaaraayanan

Amit Sinha

Mohammad Amini

Janarthanan Rajendran

Aditya Mahajan

We empirically show that classic ideas from two-time scale stochastic approximation (Borkar, 1997) can be combined with sequential iterative best response (SIBR) to solve complex cooperative multi-agent reinforcement learning (MARL) problems. We first give a multi-agent estimation problem as a motivating example where SIBR converges while parallel iterative best response (PIBR) does not. Then we present a general implementation of staged multi-agent RL algorithms based on SIBR and multi-time scale stochastic approximation, and show that our new methods, which we call Staged Independent Proximal Policy Optimization (SIPPO) and Staged Independent Q-learning (SIQL), outperform state-of-the-art independent learning on almost all the tasks in the epymarl (Papoudakis et al., 2020) benchmark. This can be seen as a first step towards more decentralized MARL methods based on SIBR and multi-time scale learning.

2022-04-25

ICLR.cc/2022/Workshop/GMS (published)

Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers

Gabriele Prato

Simon Guiroy

Ethan Caballero

Irina Rish

Empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly in light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-E. Accurately predicting the neural network performance with increasing resources such as data, compute and model size provides a more comprehensive evaluation of different approaches across multiple scales, as opposed to traditional point-wise comparisons of fixed-size models on fixed-size benchmarks, and, most importantly, allows for focus on the best-scaling, and thus most promising in the future, approaches. In this work, we consider a challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase is different from the source (training) data distribution, in the sense that it includes new image classes not encountered during training. Our current main goal is to investigate how the amount of pre-training data affects the few-shot generalization performance of standard image classifiers. Our key observations are that (1) such performance improvements are well-approximated by power laws (linear log-log plots) as the training set size increases, (2) this applies to both cases of target data coming from either the same or from a different domain (i.e., new classes) as the training data, and (3) few-shot performance on new classes converges at a faster rate than the standard classification performance on previously seen classes. Our findings shed new light on the relationship between scale and generalization.

2021-10-13

ArXiv (preprint)

Local Structure Matters Most: Perturbation Study in NLU

Louis Clouâtre

Prasanna Parthasarathi

2021-07-29

ArXiv (preprint)

Chaotic Continual Learning

Touraj Laleh

Mojtaba Faramarzi

Irina Rish

Training a deep neural network requires the model to go over training data for several epochs and update network parameters. In continual learning, this process results in catastrophic forgetting, which is one of the core issues of this domain. Most proposed approaches for this issue try to compensate for the effects of parameter updates in the batch incremental setup in which the training model visits a lot of samples for several epochs. However, it is not realistic to expect training data will always be fed to the model in a batch incremental setup. This paper proposes a chaotic stream learner that mimics the chaotic behavior of biological neurons and does not update network parameters. In addition, it can work with fewer samples compared to deep learning models in a stream learning setup. Our experiments on MNIST, CIFAR10, and Omniglot show that the chaotic stream learner has less catastrophic forgetting by its nature in comparison to a CNN model in continual learning.

2020-07-13

ICML.cc/2020/Workshop/LifelongML (unknown)

Environments for Lifelong Reinforcement Learning

Khimya Khetarpal

Shagun Sodhani

Doina Precup

To achieve general artificial intelligence, reinforcement learning (RL) agents should learn not only to optimize returns for one specific task but also to constantly build more complex skills and scaffold their knowledge about the world, without forgetting what has already been learned. In this paper, we discuss the desired characteristics of environments that can support the training and evaluation of lifelong reinforcement learning agents, review existing environments from this perspective, and propose recommendations for devising suitable environments in the future.

2018-11-26

ArXiv (preprint)