Publications

BabyAI 1.1

David Y. T. Hui

Maxime Chevalier-Boisvert

The BabyAI platform is designed to measure the sample efficiency of training an agent to follow grounded-language instructions. BabyAI 1.0 presents baseline results of an agent trained by deep imitation or reinforcement learning. BabyAI 1.1 improves the agent’s architecture in three minor ways. This increases reinforcement learning sample efficiency by up to 3× and improves imitation learning performance on the hardest level from 77% to 90.4%. We hope that these improvements increase the computational efficiency of BabyAI experiments and help users design better agents.

2021-01-01

(published)

www.semanticscholar.org

Batch Reinforcement Learning Through Continuation Method

Yijie Guo

Shengyu Feng

Nicolas Le Roux

Ed Chi

Honglak Lee

Minmin Chen

Many real-world applications of reinforcement learning (RL) require the agent to learn from a fixed set of trajectories, without collecting new interactions. Policy optimization under this setting is extremely challenging as: 1) the geometry of the objective function is hard to optimize efficiently; 2) the shift of data distributions causes high noise in the value estimation. In this work, we propose a simple yet effective policy iteration approach to batch RL using global optimization techniques known as continuation. By constraining the difference between the learned policy and the behavior policy that generates the fixed trajectories, and continuously relaxing the constraint, our method 1) helps the agent escape local optima; 2) reduces the error in policy evaluation in the optimization procedure. We present results on a variety of control tasks, game environments, and a recommendation task to empirically demonstrate the efficacy of our proposed method.

2021-01-01

ICLR (published)

openreview.net

CAMAP: Artificial neural networks unveil the role of codon arrangement in modulating MHC-I peptides presentation

Tariq Daouda

Maude Dumont-Lagacé

Albert Feghaly

Yahya Benslimane

Rébecca Panes

Mathieu Courcelles

Mohamed Benhammadi

Lea Harrington

Pierre Thibault

François Major

Yoshua Bengio

Étienne Gagnon

Sébastien Lemieux

Claude Perreault

MHC-I associated peptides (MAPs) play a central role in the elimination of virus-infected and neoplastic cells by CD8 T cells. However, accurately predicting the MAP repertoire remains difficult, because only a fraction of the transcriptome generates MAPs. In this study, we investigated whether codon arrangement (usage and placement) regulates MAP biogenesis. We developed an artificial neural network called Codon Arrangement MAP Predictor (CAMAP), predicting MAP presentation solely from mRNA sequences flanking the MAP-coding codons (MCCs), while excluding the MCC per se. CAMAP predictions were significantly more accurate when using original codon sequences than shuffled codon sequences which reflect amino acid usage. Furthermore, predictions were independent of mRNA expression and MAP binding affinity to MHC-I molecules and applied to several cell types and species. Combining MAP ligand scores, transcript expression level and CAMAP scores was particularly useful to increase MAP prediction accuracy. Using an in vitro assay, we showed that varying the synonymous codons in the regions flanking the MCCs (without changing the amino acid sequence) resulted in significant modulation of MAP presentation at the cell surface. Taken together, our results demonstrate the role of codon arrangement in the regulation of MAP presentation and support integration of both translational and post-translational events in predictive algorithms to ameliorate modeling of the immunopeptidome.

Can Open Source Licenses Help Regulate Lethal Autonomous Weapons?

Cheng Lin

AJung Moon

Lethal autonomous weapon systems (LAWS, also known as killer robots) are a real and emerging technology that have the potential to radically transform warfare. Because of the myriad of moral, legal, privacy, and security risks the technology introduces, many scholars and advocates have called for a ban on the development, production, and use of fully autonomous weapons [1], [2].

2021-01-01

IEEE Technology & Society Magazine (published)

doi.org

Can Subnetwork Structure be the Key to Out-of-Distribution Generalization?

Dinghuai Zhang

Kartik Ahuja

Yilun Xu

Yisen Wang

Aaron Courville

Can models with particular structure avoid being biased towards spurious correlation in out-of-distribution (OOD) generalization? Peters et al. (2016) provides a positive answer for linear cases. In this paper, we use a functional modular probing method to analyze deep model structures under OOD setting. We demonstrate that even in biased models (which focus on spurious correlation) there still exist unbiased functional subnetworks. Furthermore, we articulate and demonstrate the functional lottery ticket hypothesis: full network contains a subnetwork that can achieve better OOD performance. We then propose Modular Risk Minimization to solve the subnetwork selection problem. Our algorithm learns the subnetwork structure from a given dataset, and can be combined with any other OOD regularization methods. Experiments on various OOD generalization tasks corroborate the effectiveness of our method.

2021-01-01

ICML (published)

proceedings.mlr.press

arxiv.org

Capacity Expansion in the College Admission Problem

Federico Bobbio

Margarida Carvalho

Andrea Lodi

Alfredo Torrico

2021-01-01

arXiv.org (preprint)

dblp.uni-trier.de

Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data

Jonathan Pilault

Amine El hattami

Chris Pal

Multi-Task Learning (MTL) networks have emerged as a promising method for transferring learned knowledge across different tasks. However, MTL must deal with challenges such as: overfitting to low resource tasks, catastrophic forgetting, and negative task transfer, or learning interference. Often, in Natural Language Processing (NLP), a separate model per task is needed to obtain the best performance. However, many fine-tuning approaches are both parameter inefficient, i.e., potentially involving one new model per task, and highly susceptible to losing knowledge acquired during pretraining. We propose a novel Transformer based Hypernetwork Adapter consisting of a new conditional attention mechanism as well as a set of task-conditioned modules that facilitate weight sharing. Through this construction, we achieve more efficient parameter sharing and mitigate forgetting by keeping half of the weights of a pretrained model fixed. We also use a new multi-task data sampling strategy to mitigate the negative effects of data imbalance across tasks. Using this approach, we are able to surpass single task fine-tuning methods while being parameter and data efficient (using around 66% of the data). Compared to other BERT Large methods on GLUE, our 8-task model surpasses other Adapter methods by 2.8% and our 24-task model outperforms by 0.7-1.0% models that use MTL and single task fine-tuning. We show that a larger variant of our single multi-task model approach performs competitively across 26 NLP tasks and yields state-of-the-art results on a number of test and development sets.

2021-01-01

ICLR (published)

openreview.net

A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning

Harry Zhao

Mingde Zhao

Zhen Liu

Sitao Luan

Shuyuan Zhang

Doina Precup

Yoshua Bengio

We present an end-to-end, model-based deep reinforcement learning agent which dynamically attends to relevant parts of its state during planning. The agent uses a bottleneck mechanism over a set-based representation to force the number of entities to which the agent attends at each planning step to be small. In experiments, we investigate the bottleneck mechanism with several sets of customized environments featuring different challenges. We consistently observe that the design allows the planning agents to generalize their learned task-solving abilities in compatible unseen environments by attending to the relevant objects, leading to better out-of-distribution generalization performance.

openreview.net

Consistency and Rate of Convergence of Switched Least Squares System Identification for Autonomous Switched Linear Systems

Borna Sayedana

Mohammad Afshari

Peter E. Caines

Aditya Mahajan

In this paper, we investigate the problem of system identification for autonomous switched linear systems with complete state observations. We propose the switched least squares method for the identification of switched linear systems, show that this method is strongly consistent, and derive data-dependent and data-independent rates of convergence. In particular, our data-dependent rate of convergence shows that, almost surely, the system identification error is $O\left(\sqrt{\log(T)/T}\right)$, where $T$ is the time horizon. These results show that our method for switched linear systems has the same rate of convergence as the least squares method for non-switched linear systems. We compare our results with those in the literature. We present numerical examples to illustrate the performance of the proposed system identification method.

2021-01-01

arXiv.org (preprint)

dblp.uni-trier.de
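
As a rough illustration of the switched least squares idea described in the abstract above, the sketch below fits one least-squares estimate per mode from a single trajectory. It is not the authors' implementation: it assumes a scalar state, a switching signal that is observed and drawn uniformly at random, and Gaussian noise. The names `simulate` and `switched_least_squares` are hypothetical.

```python
import random


def switched_least_squares(states, modes, num_modes):
    """Estimate one scalar gain per mode from a single trajectory.

    states: x_0, ..., x_T; modes: sigma_0, ..., sigma_{T-1} (assumed observed).
    Model assumed here: x_{t+1} = a[sigma_t] * x_t + noise.
    """
    num = [0.0] * num_modes  # per-mode sum of x_t * x_{t+1}
    den = [0.0] * num_modes  # per-mode sum of x_t ** 2
    for t, mode in enumerate(modes):
        num[mode] += states[t] * states[t + 1]
        den[mode] += states[t] ** 2
    # A mode that was never visited has no estimate.
    return [n / d if d > 0 else float("nan") for n, d in zip(num, den)]


def simulate(gains, horizon, noise_std=0.1, seed=0):
    """Generate a trajectory of a scalar switched system (illustrative only)."""
    rng = random.Random(seed)
    states, modes = [1.0], []
    for _ in range(horizon):
        mode = rng.randrange(len(gains))  # assumption: uniform random switching
        modes.append(mode)
        states.append(gains[mode] * states[-1] + rng.gauss(0.0, noise_std))
    return states, modes


true_gains = [0.5, -0.8]
states, modes = simulate(true_gains, horizon=5000)
estimates = switched_least_squares(states, modes, num_modes=len(true_gains))
print(estimates)
```

Each estimate uses only the transitions observed while its mode was active, so the data are partitioned by mode before the usual least-squares formula is applied. This toy run does not verify the convergence rate quoted in the abstract.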

Continual Learning via Local Module Composition

Oleksiy Ostapenko

Pau Rodriguez

Massimo Caccia

Laurent Charlin

Modularity is a compelling solution to continual learning (CL), the problem of modeling sequences of related tasks. Learning and then composing modules to solve different tasks provides an abstraction to address the principal challenges of CL including catastrophic forgetting, backward and forward transfer across tasks, and sub-linear model growth. We introduce local module composition (LMC), an approach to modular CL where each module is provided a local structural component that estimates a module's relevance to the input. Dynamic module composition is performed layer-wise based on local relevance scores. We demonstrate that agnosticity to task identities (IDs) arises from (local) structural learning that is module-specific as opposed to the task- and/or model-specific as in previous works, making LMC applicable to more CL settings compared to previous works. In addition, LMC also tracks statistics about the input distribution and adds new modules when outlier samples are detected. In the first set of experiments, LMC performs favorably compared to existing methods on the recent Continual Transfer-learning Benchmark without requiring task identities. In another study, we show that the locality of structural learning allows LMC to interpolate to related but unseen tasks (OOD), as well as to compose modular networks trained independently on different task sequences into a third modular network without any fine-tuning. Finally, in search for limitations of LMC we study it on more challenging sequences of 30 and 100 tasks, demonstrating that local module selection becomes much more challenging in presence of a large number of candidate modules. In this setting best performing LMC spawns much fewer modules compared to an oracle based baseline, however, it reaches a lower overall accuracy. The codebase is available under https://github.com/oleksost/LMC.

openreview.net

Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning

Rishabh Agarwal

Marlos C. Machado

Pablo Samuel Castro

Marc Gendron-Bellemare

Reinforcement learning methods trained on few environments rarely learn policies that generalize to unseen environments. To improve generalization, we incorporate the inherent sequential structure in reinforcement learning into the representation learning process. This approach is orthogonal to recent approaches, which rarely exploit this structure explicitly. Specifically, we introduce a theoretically motivated policy similarity metric (PSM) for measuring behavioral similarity between states. PSM assigns high similarity to states for which the optimal policies in those states as well as in future states are similar. We also present a contrastive representation learning procedure to embed any state similarity metric, which we instantiate with PSM to obtain policy similarity embeddings (PSEs). We demonstrate that PSEs improve generalization on diverse benchmarks, including LQR with spurious correlations, a jumping task from pixels, and Distracting DM Control Suite.

2021-01-01

ICLR (published)

openreview.net
