Publications

Chaotic Continual Learning

Touraj Laleh

Mojtaba Faramarzi

Training a deep neural network requires the model to go over training data for several epochs and update network parameters. In continual le… (see more)arning, this process results in catastrophic forgetting which is one of the core issues of this domain. Most proposed approaches for this issue try to compensate for the effects of parameter updates in the batch incremental setup in which the training model visits a lot of samples for several epochs. However, it is not realistic to expect training data will always be fed to model in a batch incremental setup. This paper proposes a chaotic stream learner that mimics the chaotic behavior of biological neurons and does not updates network parameters. In addition, it can work with fewer samples compared to deep learning models on stream learning setup. Our experiments on MNIST, CIFAR10, and Omniglot show that the chaotic stream learner has less catastrophic forgetting by its nature in comparison to a CNN model in continual learning.

2020-07-13

ICML.cc/2020/Workshop/LifelongML (unknown)

openreview.net

Historical Issue Data of Projects on Jira

A. Nicholson

Deeksha M. Arya

Jin Guo

2020-07-13

(published)

doi.org

S2RMs: Spatially Structured Recurrent Modules

Nasim Rahaman

Anirudh Goyal

Muhammad Waleed Gondal

Manuel Wüthrich

Stefan Bauer

Y. Sharma

Yoshua Bengio

Bernhard Schölkopf

2020-07-13

ArXiv (preprint)

arxiv.org

Towards an Unsupervised Method for Model Selection in Few-Shot Learning

Simon Guiroy

Vikas Verma

Chris Pal

The study of generalization of neural networks in gradient-based meta-learning has recently great research interest. Previous work on the st… (see more)udy of the objective landscapes within the scope of few-shot classiﬁcation empirically demonstrated that generalization to new tasks might be linked to the average inner product between their respective gradients vectors (Guiroy et al., 2019). Following that work, we study the effect that meta-training has on the learned space of representation of the network. Notably, we demonstrate that the global similarity in the space of representation, measured by the average inner product between the embeddings of meta-test examples, also correlates to generalization. Based on these observations, we propose a novel model-selection criterion for gradient-based meta-learning and experimentally validate its effectiveness.

2020-07-13

ICML.cc/2020/Workshop/LifelongML (unknown)

openreview.net

Data-Efficient Reinforcement Learning with Momentum Predictive Representations

Max Schwarzer

Ankesh Anand

Rishab Goel

(Rex) Devon Hjelm

Aaron Courville

Philip Bachman

While deep reinforcement learning excels at solving tasks where large amounts of data can be collected through virtually unlimited interacti… (see more)on with the environment, learning from limited interaction remains a key challenge. We posit that an agent can learn more efficiently if we augment reward maximization with self-supervised objectives based on structure in its visual input and sequential interaction with the environment. Our method, Momentum Predictive Representations (MPR), trains an agent to predict its own latent state representations multiple steps into the future. We compute target representations for future states using an encoder which is an exponential moving average of the agent's parameters, and we make predictions using a learned transition model. On its own, this future prediction objective outperforms prior methods for sample-efficient deep RL from pixels. We further improve performance by adding data augmentation to the future prediction loss, which forces the agent's representations to be consistent across multiple views of an observation. Our full self-supervised objective, which combines future prediction and data augmentation, achieves a median human-normalized score of 0.444 on Atari in a setting limited to 100K steps of environment interaction, which is a 66% relative improvement over the previous state-of-the-art. Moreover, even in this limited data regime, MPR exceeds expert human scores on 6 out of 26 games.

2020-07-12

arXiv.org (preprint)

dblp.uni-trier.de

Material for IEEE Software paper "How Do Open Source Software Contributors Perceive and Address Usability?"

Wenting Wang

Jinghui Cheng

Jin Guo

2020-07-10

(published)

doi.org

Attenuated Anticipation of Social and Monetary Rewards in Autism Spectrum Disorders

Sarah Baumeister

Carolin Moessnang

Nico Bast

Sarah Hohmann

Julian Tillmann

David Goyard

Tony Charman

Sara Ambrosino

Simon Baron-Cohen

Christian Beckmann

Sven Bölte

Thomas Bourgeron

Annika Rausch

Daisy Crawley

Flavio Dell’Acqua

Guillaume Dumas

Sarah Durston

Christine Ecker

Dorothea L. Floris

Vincent Frouin … (see 19 more)

Hannah Hayward

Rosemary Holt

Mark Johnson

Emily J. H. Jones

Meng-Chuan Lai

Michael V. Lombardo

Luke Mason

Marianne Oldehinkel

Tony Persico

Antonia San José Cáceres

Thomas Wolfers

Will Spooren

Eva Loth

Declan Murphy

Jan K. Buitelaar

Heike Tost

Andreas Meyer-Lindenberg

Tobias Banaschewski

Daniel Brandeis

Background Reward processing has been proposed to underpin atypical social behavior, a core feature of autism spectrum disorder (ASD). Howev… (see more)er, previous neuroimaging studies have yielded inconsistent results regarding the specificity of atypicalities for social rewards in ASD. Utilizing a large sample, we aimed to assess altered reward processing in response to reward type (social, monetary) and reward phase (anticipation, delivery) in ASD. Methods Functional magnetic resonance imaging during social and monetary reward anticipation and delivery was performed in 212 individuals with ASD (7.6-30.5 years) and 181 typically developing (TD) participants (7.6-30.8 years). Results Across social and monetary reward anticipation, whole-brain analyses (p0.05, family-wise error-corrected) showed hypoactivation of the right ventral striatum (VS) in ASD. Further, region of interest (ROI) analy

2020-07-06

bioRxiv (preprint)

doi.org

Deep interpretability for GWAS

Deepak Sharma

Audrey Durand

Marc-André Legault

Louis-philippe Lemieux Perreault

Audrey Lemaccon

Marie-Pierre Dub'e

Joelle Pineau

Genome-Wide Association Studies are typically conducted using linear models to find genetic variants associated with common diseases. In the… (see more)se studies, association testing is done on a variant-by-variant basis, possibly missing out on non-linear interaction effects between variants. Deep networks can be used to model these interactions, but they are difficult to train and interpret on large genetic datasets. We propose a method that uses the gradient based deep interpretability technique named DeepLIFT to show that known diabetes genetic risk factors can be identified using deep models along with possibly novel associations.

2020-07-03

ArXiv (preprint)

arxiv.org

Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs

Kian Ahrabian

Daniel Tarlow

Hehuimin Cheng

Jin Guo

We present a multi-relational temporal Knowledge Graph based on the daily interactions between artifacts in GitHub, one of the largest socia… (see more)l coding platforms. Such representation enables posing many user-activity and project management questions as link prediction and time queries over the knowledge graph. In particular, we introduce two new datasets for i) interpolated time-conditioned link prediction and ii) extrapolated time-conditioned link/time prediction queries, each with distinguished properties. Our experiments on these datasets highlight the potential of adapting knowledge graphs to answer broad software engineering questions. Meanwhile, it also reveals the unsatisfactory performance of existing temporal models on extrapolated queries and time prediction queries in general. To overcome these shortcomings, we introduce an extension to current temporal models using relative temporal information with regards to past events.

2020-07-02

ArXiv (preprint)

arxiv.org

Compositional Generalization by Factorizing Alignment and Translation

Jacob Russin

Jason Jo

R. O’Reilly

Yoshua Bengio

2020-07-01

Annual Meeting of the Association for Computational Linguistics (published)

doi.org

Counterexamples on the Monotonicity of Delay Optimal Strategies for Energy Harvesting Transmitters

Borna Sayedana

Aditya Mahajan

We consider cross-layer design of delay optimal transmission strategies for energy harvesting transmitters where the data and energy arrival… (see more) processes are stochastic. Using Markov decision theory, we show that the value function is weakly increasing in the queue state and weakly decreasing in the battery state. It is natural to expect that the delay optimal policy should be weakly increasing in the queue and battery states. We show via counterexamples that this is not the case. In fact, we show that for some sample scenarios the delay optimal policy may perform 5–13% better than the best monotone policy.

2020-07-01

IEEE Wireless Communications Letters (published)

doi.org

arxiv.org

Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach

Wenyu Du

Zhouhan Lin

Yikang Shen

Timothy O'Donnell

Yoshua Bengio

Yue Sara Zhang

It is commonly believed that knowledge of syntactic structure should improve language modeling. However, effectively and computationally eff… (see more)iciently incorporating syntactic structure into neural language models has been a challenging topic. In this paper, we make use of a multi-task objective, i.e., the models simultaneously predict words as well as ground truth parse trees in a form called “syntactic distances”, where information between these two separate objectives shares the same intermediate representation. Experimental results on the Penn Treebank and Chinese Treebank datasets show that when ground truth parse trees are provided as additional training signals, the model is able to achieve lower perplexity and induce trees with better quality.

2020-07-01

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (published)

doi.org

arxiv.org

Mila AI Policy Fellowship

The Development of the UN Scientific Panel on AI

Mila AI Policy Fellowship

The Development of the UN Scientific Panel on AI

Publications

Mila AI Policy Fellowship

The Development of the UN Scientific Panel on AI

Mila AI Policy Fellowship

The Development of the UN Scientific Panel on AI

Popular keywords:

Publications