Sarath Chandar Anbil Parthipan

ali-rahimi.kalahroudi@mila.quebec

Biographie

Sarath Chandar est professeur adjoint à Polytechnique Montréal, où il dirige le laboratoire de recherche Chandar. Il est également membre du corps professoral de Mila – Institut québécois d’intelligence artificielle, et titulaire d'une chaire en IA Canada-CIFAR et d'une Chaire de recherche du Canada en apprentissage machine permanent. Ses recherches portent sur l'apprentissage tout au long de la vie, l'apprentissage profond, l'optimisation, l'apprentissage par renforcement et le traitement du langage naturel. Pour promouvoir la recherche sur l'apprentissage tout au long de la vie, Sarath Chandar a créé la Conférence sur les agents d'apprentissage tout au long de la vie (CoLLAs) en 2022 et a présidé le programme en 2022 et en 2023. Il est titulaire d'un doctorat de l'Université de Montréal et d'une maîtrise en recherche de l'Indian Institute of Technology Madras.

Étudiants actuels

Abdelrahman Zayed

Doctorat - Polytechnique Montréal

zayedabd@mila.quebec

Ali Rahimi-Kalahroudi

Maîtrise recherche - Université de Montréal

Amir Ardalan Kalantari Dehaghi

Collaborateur·rice alumni

Doctorat - Polytechnique Montréal

Co-superviseur⋅e :

Siva Reddy

andreas.madsen@mila.quebec

Maîtrise recherche - Polytechnique Montréal

antoine.clavaud@mila.quebec

Arjun Vaithilingam Sudhakar

Doctorat - Polytechnique Montréal

arjun.vaithilingam-sudhakar@mila.quebec

Artem Zholus

Doctorat - Polytechnique Montréal

artem.zholus@mila.quebec

Doctorat - Université de Montréal

darshan.patil@mila.quebec

francois.rivest@mila.quebec

Francois Rivest

Visiteur de recherche indépendant

goncalo-filipe.torcato-mordido@mila.quebec

Gabriele Prato

Doctorat - Université de Montréal

Postdoctorat - Polytechnique Montréal

Hadi NekoeiQachkanloo

Doctorat - Université de Montréal

nekoeihe@mila.quebec

Ista Abbes

Maîtrise recherche - Université de Montréal

istabrak.abbes@mila.quebec

Janarthanan Rajendran

Postdoctorat - Université de Montréal

Co-superviseur⋅e :

Doina Precup

janarthanan.rajendran@mila.quebec

jarrid.rector-brooks@mila.quebec

Jarrid Rector-Brooks

Doctorat - Université de Montréal

Superviseur⋅e principal⋅e :

Yoshua Bengio

Jerry Huang

Maîtrise recherche - Université de Montréal

jerry.huang@mila.quebec

Kshitij Gupta

Collaborateur·rice alumni - Université de Montréal

Superviseur⋅e principal⋅e :

Irina Rish

kshitij.gupta@mila.quebec

Lola Le Breton

Maîtrise recherche - Polytechnique Montréal

lola.lebreton@mila.quebec

Louis Clouatre

Doctorat - Polytechnique Montréal

Superviseur⋅e principal⋅e :

maryam.hashemzadeh@mila.quebec

clouatrl@mila.quebec

Maryam Hashemzadeh

Doctorat - None

Mathieu Duchesneau

Doctorat - Université de Montréal

Postdoctorat

mathieu.reymond@mila.quebec

Maziar Sargordi

Doctorat - Polytechnique Montréal

Co-superviseur⋅e :

maziar.sargordi@mila.quebec

mohammad-reza.samsami@mila.quebec

Megh Thakkar

Maîtrise recherche - Université de Montréal

megh.thakkar@mila.quebec

Mohammad R. Samsami

Maîtrise recherche - Université de Montréal

naga-karthik.enamundram@mila.quebec

Naga Karthik Enamundram

Doctorat - Polytechnique Montréal

Superviseur⋅e principal⋅e :

Julien Cohen-Adad

Doctorat - Polytechnique Montréal

pranshu.malviya@mila.quebec

prashant.govindarajan@mila.quebec

Prashant Govindarajan

Doctorat - Polytechnique Montréal

Simon Guiroy

Doctorat - Université de Montréal

Superviseur⋅e principal⋅e :

Xutong Zhao

Doctorat - Polytechnique Montréal

xutong.zhao@mila.quebec

Publications

Replay Buffer with Local Forgetting for Adapting to Local Environment Changes in Deep Model-Based Reinforcement Learning

Ali Rahimi-Kalahroudi

Janarthanan Rajendran

Ida Momennejad

Harm van Seijen

2023-01-01

CoLLAs (publié)

proceedings.mlr.press

Self-Influence Guided Data Reweighting for Language Model Pre-training

Megh Thakkar

Tolga Bolukbasi

Sriram Ganapathy

Shikhar Vashishth

Partha Talukdar

Language Models (LMs) pre-trained with selfsupervision on large text corpora have become the default starting point for developing models fo… (voir plus)r various NLP tasks. Once the pre-training corpus has been assembled, all data samples in the corpus are treated with equal importance during LM pre-training. However, due to varying levels of relevance and quality of data, equal importance to all the data samples may not be the optimal choice. While data reweighting has been explored in the context of task-specific supervised learning and LM fine-tuning, model-driven reweighting for pretraining data has not been explored. We fill this important gap and propose PRESENCE, a method for jointly reweighting samples by leveraging self-influence (SI) scores as an indicator of sample importance and pre-training. PRESENCE promotes novelty and stability for model pre-training. Through extensive analysis spanning multiple model sizes, datasets, and tasks, we present PRESENCE as an important first step in the research direction of sample reweighting for pre-training language models.

2023-01-01

EMNLP (publié)

Post-hoc Interpretability for Neural NLP: A Survey

Andreas Madsen

Siva Reddy

2022-12-23

ACM Computing Surveys (publié)

Replay Buffer With Local Forgetting for Adaptive Deep Model-Based Reinforcement Learning

Ali Rahimi-Kalahroudi

Janarthanan Rajendran

Ida Momennejad

Harm van Seijen

One of the key behavioral characteristics used in neuroscience to determine whether the subject of study—be it a rodent or a human—exhib… (voir plus)its model-based learning is effective adaptation to local changes in the environment. In reinforcement learning, however, recent work has shown that modern deep model-based reinforcement-learning (MBRL) methods adapt poorly to such changes. An explanation for this mismatch is that MBRL methods are typically designed with sample-efﬁciency on a single task in mind and the requirements for effective adaptation are substantially higher, both in terms of the learned world model and the planning routine. One particularly challenging requirement is that the learned world model has to be sufﬁciently accurate throughout relevant parts of the state-space. This is challenging for deep-learning-based world models due to catastrophic forgetting. And while a replay buffer can mitigate the effects of catastrophic forgetting, the traditional ﬁrst-in-ﬁrst-out replay buffer precludes effective adaptation due to maintaining stale data. In this work

2022-12-09

NeurIPS.cc/2022/Workshop/DeepRL (inconnu)

PatchBlender: A Motion Prior for Video Transformers

Gabriele Prato

Yale Song

Janarthanan Rajendran

(Rex) Devon Hjelm

Neel Joshi

2022-11-11

ArXiv (prépublication)

Local Structure Matters Most: Perturbation Study in NLU

Louis Clouâtre

Prasanna Parthasarathi

Recent research analyzing the sensitivity of natural language understanding models to word-order perturbations has shown that neural models … (voir plus)are surprisingly insensitive to the order of words.In this paper, we investigate this phenomenon by developing order-altering perturbations on the order of words, subwords, and characters to analyze their effect on neural models’ performance on language understanding tasks.We experiment with measuring the impact of perturbations to the local neighborhood of characters and global position of characters in the perturbed texts and observe that perturbation functions found in prior literature only affect the global ordering while the local ordering remains relatively unperturbed.We empirically show that neural models, invariant of their inductive biases, pretraining scheme, or the choice of tokenization, mostly rely on the local structure of text to build understanding and make limited use of the global structure.

2022-05-01

Findings of the Association for Computational Linguistics: ACL 2022 (publié)

Staged independent learning: Towards decentralized cooperative multi-agent Reinforcement Learning

Hadi Nekoei

Akilesh Badrinaaraayanan

Amit Sinha

Mohammad Amini

Janarthanan Rajendran

Aditya Mahajan

We empirically show that classic ideas from two-time scale stochastic approximation \citep{borkar1997stochastic} can be combined with sequen… (voir plus)tial iterative best response (SIBR) to solve complex cooperative multi-agent reinforcement learning (MARL) problems. We first start with giving a multi-agent estimation problem as a motivating example where SIBR converges while parallel iterative best response (PIBR) does not. Then we present a general implementation of staged multi-agent RL algorithms based on SIBR and multi-time scale stochastic approximation, and show that our new methods which we call Staged Independent Proximal Policy Optimization (SIPPO) and Staged Independent Q-learning (SIQL) outperform state-of-the-art independent learning on almost all the tasks in the epymarl \citep{papoudakis2020benchmarking} benchmark. This can be seen as a first step towards more decentralized MARL methods based on SIBR and multi-time scale learning.

2022-04-25

ICLR.cc/2022/Workshop/GMS (publié)

Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers

Gabriele Prato

Simon Guiroy

Ethan Caballero

Irina Rish

Empirical science of neural scaling laws is a rapidly growing area of significant importance to the future of machine learning, particularly… (voir plus) in the light of recent breakthroughs achieved by large-scale pre-trained models such as GPT-3, CLIP and DALL-e. Accurately predicting the neural network performance with increasing resources such as data, compute and model size provides a more comprehensive evaluation of different approaches across multiple scales, as opposed to traditional point-wise comparisons of fixed-size models on fixed-size benchmarks, and, most importantly, allows for focus on the best-scaling, and thus most promising in the future, approaches. In this work, we consider a challenging problem of few-shot learning in image classification, especially when the target data distribution in the few-shot phase is different from the source, training, data distribution, in a sense that it includes new image classes not encountered during training. Our current main goal is to investigate how the amount of pre-training data affects the few-shot generalization performance of standard image classifiers. Our key observations are that (1) such performance improvements are well-approximated by power laws (linear log-log plots) as the training set size increases, (2) this applies to both cases of target data coming from either the same or from a different domain (i.e., new classes) as the training data, and (3) few-shot performance on new classes converges at a faster rate than the standard classification performance on previously seen classes. Our findings shed new light on the relationship between scale and generalization.

2021-10-13

ArXiv (prépublication)

Local Structure Matters Most: Perturbation Study in NLU

Louis Clouâtre

Prasanna Parthasarathi

2021-07-29

ArXiv (preprint)

Chaotic Continual Learning

Touraj Laleh

Mojtaba Faramarzi

Irina Rish

Training a deep neural network requires the model to go over training data for several epochs and update network parameters. In continual le… (voir plus)arning, this process results in catastrophic forgetting which is one of the core issues of this domain. Most proposed approaches for this issue try to compensate for the effects of parameter updates in the batch incremental setup in which the training model visits a lot of samples for several epochs. However, it is not realistic to expect training data will always be fed to model in a batch incremental setup. This paper proposes a chaotic stream learner that mimics the chaotic behavior of biological neurons and does not updates network parameters. In addition, it can work with fewer samples compared to deep learning models on stream learning setup. Our experiments on MNIST, CIFAR10, and Omniglot show that the chaotic stream learner has less catastrophic forgetting by its nature in comparison to a CNN model in continual learning.

2020-07-13

ICML.cc/2020/Workshop/LifelongML (inconnu)

Environments for Lifelong Reinforcement Learning

Khimya Khetarpal

Shagun Sodhani

Doina Precup

To achieve general artificial intelligence, reinforcement learning (RL) agents should learn not only to optimize returns for one specific ta… (voir plus)sk but also to constantly build more complex skills and scaffold their knowledge about the world, without forgetting what has already been learned. In this paper, we discuss the desired characteristics of environments that can support the training and evaluation of lifelong reinforcement learning agents, review existing environments from this perspective, and propose recommendations for devising suitable environments in the future.

2018-11-26

ArXiv (prépublication)