Sarath Chandar Anbil Parthipan

Core Academic Member
Canada CIFAR AI Chair
Assistant Professor, Polytechnique Montréal, Department of Computer Engineering and Software Engineering
Adjunct Professor, Université de Montréal, Department of Computer Science and Operations Research
Indian Institute of Technology Madras

Biography

Sarath Chandar is an assistant professor at Polytechnique Montréal, where he leads the Chandar Research Lab. He is also a core academic member of Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair and the Canada Research Chair in Lifelong Machine Learning.

Chandar’s research interests include lifelong learning, deep learning, optimization, reinforcement learning and natural language processing. To promote research in lifelong learning, Chandar created the Conference on Lifelong Learning Agents (CoLLAs) in 2022, for which he served as program chair in 2022 and 2023.

He has a PhD from Université de Montréal and an MSc (By Research) from the Indian Institute of Technology Madras.

Current Students

PhD - Polytechnique Montréal
Master's Research - Université de Montréal
PhD - Polytechnique Montréal
Co-supervisor :
Master's Research - Polytechnique Montréal
PhD - Université de Montréal
Postdoctorate - Polytechnique Montréal
PhD - Université de Montréal
Master's Research - Université de Montréal
Postdoctorate - Université de Montréal
Co-supervisor :
PhD - Université de Montréal
Principal supervisor :
Master's Research - Université de Montréal
Collaborating Alumni - Université de Montréal
Principal supervisor :
Master's Research - Polytechnique Montréal
PhD - Polytechnique Montréal
Principal supervisor :
PhD - Polytechnique Montréal
Co-supervisor :
Master's Research - Université de Montréal
PhD - Université de Montréal
Principal supervisor :
PhD - Polytechnique Montréal

Publications

Replay Buffer with Local Forgetting for Adapting to Local Environment Changes in Deep Model-Based Reinforcement Learning
Ali Rahimi-Kalahroudi
Janarthanan Rajendran
Ida Momennejad
Harm van Seijen
Self-Influence Guided Data Reweighting for Language Model Pre-training
Megh Thakkar
Tolga Bolukbasi
Sriram Ganapathy
Shikhar Vashishth
Partha Talukdar
Language Models (LMs) pre-trained with self-supervision on large text corpora have become the default starting point for developing models for various NLP tasks. Once the pre-training corpus has been assembled, all data samples in the corpus are treated with equal importance during LM pre-training. However, due to varying levels of relevance and quality of data, equal importance to all the data samples may not be the optimal choice. While data reweighting has been explored in the context of task-specific supervised learning and LM fine-tuning, model-driven reweighting for pre-training data has not been explored. We fill this important gap and propose PRESENCE, a method for jointly reweighting samples by leveraging self-influence (SI) scores as an indicator of sample importance and pre-training. PRESENCE promotes novelty and stability for model pre-training. Through extensive analysis spanning multiple model sizes, datasets, and tasks, we present PRESENCE as an important first step in the research direction of sample reweighting for pre-training language models.
Post-hoc Interpretability for Neural NLP: A Survey
Replay Buffer With Local Forgetting for Adaptive Deep Model-Based Reinforcement Learning
Ali Rahimi-Kalahroudi
Janarthanan Rajendran
Ida Momennejad
Harm van Seijen
One of the key behavioral characteristics used in neuroscience to determine whether the subject of study, be it a rodent or a human, exhibits model-based learning is effective adaptation to local changes in the environment. In reinforcement learning, however, recent work has shown that modern deep model-based reinforcement-learning (MBRL) methods adapt poorly to such changes. An explanation for this mismatch is that MBRL methods are typically designed with sample-efficiency on a single task in mind and the requirements for effective adaptation are substantially higher, both in terms of the learned world model and the planning routine. One particularly challenging requirement is that the learned world model has to be sufficiently accurate throughout relevant parts of the state-space. This is challenging for deep-learning-based world models due to catastrophic forgetting. And while a replay buffer can mitigate the effects of catastrophic forgetting, the traditional first-in-first-out replay buffer precludes effective adaptation due to maintaining stale data. In this work
PatchBlender: A Motion Prior for Video Transformers
Gabriele Prato
Yale Song
Janarthanan Rajendran
Neel Joshi
Local Structure Matters Most: Perturbation Study in NLU
Louis Clouâtre
Prasanna Parthasarathi
Recent research analyzing the sensitivity of natural language understanding models to word-order perturbations has shown that neural models are surprisingly insensitive to the order of words. In this paper, we investigate this phenomenon by developing order-altering perturbations on the order of words, subwords, and characters to analyze their effect on neural models’ performance on language understanding tasks. We experiment with measuring the impact of perturbations to the local neighborhood of characters and global position of characters in the perturbed texts and observe that perturbation functions found in prior literature only affect the global ordering while the local ordering remains relatively unperturbed. We empirically show that neural models, invariant of their inductive biases, pretraining scheme, or the choice of tokenization, mostly rely on the local structure of text to build understanding and make limited use of the global structure.
Staged independent learning: Towards decentralized cooperative multi-agent Reinforcement Learning
Hadi Nekoei
Akilesh Badrinaaraayanan
Amit Sinha
Mohammad Amini
Janarthanan Rajendran
We empirically show that classic ideas from two-time scale stochastic approximation (Borkar, 1997) can be combined with sequential iterative best response (SIBR) to solve complex cooperative multi-agent reinforcement learning (MARL) problems. We start by giving a multi-agent estimation problem as a motivating example where SIBR converges while parallel iterative best response (PIBR) does not. Then we present a general implementation of staged multi-agent RL algorithms based on SIBR and multi-time scale stochastic approximation, and show that our new methods, which we call Staged Independent Proximal Policy Optimization (SIPPO) and Staged Independent Q-learning (SIQL), outperform state-of-the-art independent learning on almost all the tasks in the EPyMARL (Papoudakis et al., 2020) benchmark. This can be seen as a first step towards more decentralized MARL methods based on SIBR and multi-time scale learning.
Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers
Gabriele Prato
Simon Guiroy
Ethan Caballero
The empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly in light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-E. Accurately predicting neural network performance with increasing resources such as data, compute and model size provides a more comprehensive evaluation of different approaches across multiple scales, as opposed to traditional point-wise comparisons of fixed-size models on fixed-size benchmarks, and, most importantly, allows for a focus on the best-scaling, and thus most promising, approaches. In this work, we consider the challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase differs from the source (training) data distribution, in the sense that it includes new image classes not encountered during training. Our current main goal is to investigate how the amount of pre-training data affects the few-shot generalization performance of standard image classifiers. Our key observations are that (1) such performance improvements are well-approximated by power laws (linear log-log plots) as the training set size increases, (2) this applies to both cases of target data coming from either the same or from a different domain (i.e., new classes) as the training data, and (3) few-shot performance on new classes converges at a faster rate than the standard classification performance on previously seen classes. Our findings shed new light on the relationship between scale and generalization.
Chaotic Continual Learning
Touraj Laleh
Mojtaba Faramarzi
Training a deep neural network requires the model to go over training data for several epochs and update network parameters. In continual learning, this process results in catastrophic forgetting, which is one of the core issues of this domain. Most proposed approaches for this issue try to compensate for the effects of parameter updates in the batch incremental setup, in which the training model visits a lot of samples for several epochs. However, it is not realistic to expect that training data will always be fed to the model in a batch incremental setup. This paper proposes a chaotic stream learner that mimics the chaotic behavior of biological neurons and does not update network parameters. In addition, it can work with fewer samples compared to deep learning models in a stream learning setup. Our experiments on MNIST, CIFAR10, and Omniglot show that the chaotic stream learner has less catastrophic forgetting by its nature in comparison to a CNN model in continual learning.
Environments for Lifelong Reinforcement Learning
To achieve general artificial intelligence, reinforcement learning (RL) agents should learn not only to optimize returns for one specific task but also to constantly build more complex skills and scaffold their knowledge about the world, without forgetting what has already been learned. In this paper, we discuss the desired characteristics of environments that can support the training and evaluation of lifelong reinforcement learning agents, review existing environments from this perspective, and propose recommendations for devising suitable environments in the future.