Publications

Untangling tradeoffs between recurrence and self-attention in artificial neural networks

Giancarlo Kerg

Bhargav Kanuparthi

Anirudh Goyal

Kyle Goyette

Supplementary Material - Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning

Sai Krishna Gottipati

B. Sattarov

Sufeng Niu

Yashaswi Pathak

Haoran Wei

Shengchao Liu

Karam M. J. Thomas

Simon R. Blackburn

Connor W. Coley

Jian Tang

Sarath Chandar

Yoshua Bengio

While updating the critic network, we multiply the normal random noise vector with policy noise of 0.2 and then clip it in the range -0.2 to 0.2. This clipped policy noise is added to the action at the next time step a′ computed by the target actor networks f and π. The actor networks (f and π networks), target critic and target actor networks are updated once every two updates to the critic network.
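
The noise and update schedule described in this excerpt can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' code: the names `smoothed_target_action`, `should_update_actor` and the `rng` argument are hypothetical, and only the 0.2 noise scale, the clip range of -0.2 to 0.2, and the one-in-two update delay come from the text.

```python
import numpy as np

POLICY_NOISE = 0.2  # scale applied to the standard normal noise (from the text)
NOISE_CLIP = 0.2    # noise is clipped to [-0.2, 0.2] (from the text)
POLICY_DELAY = 2    # actor and target networks update once per two critic updates


def smoothed_target_action(target_action, rng):
    """Add clipped Gaussian noise to the next action a' from the target actor."""
    noise = rng.standard_normal(target_action.shape) * POLICY_NOISE
    noise = np.clip(noise, -NOISE_CLIP, NOISE_CLIP)
    return target_action + noise


def should_update_actor(critic_step):
    """Return True on every second critic update (steps counted from 1)."""
    return critic_step % POLICY_DELAY == 0
```

In a training loop, `smoothed_target_action` would be applied before evaluating the target critic, and `should_update_actor` would gate the actor and target-network updates. The sketch does not re-clip the noisy action to any action bounds, since the excerpt does not mention that step.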

Value-driven Hindsight Modelling

Arthur Guez

Fabio Viola

Theophane Weber

Lars Buesing

Steven Kapturowski

Doina Precup

David Silver

Nicolas Heess

Value estimation is a critical component of the reinforcement learning (RL) paradigm. The question of how to effectively learn predictors for value from data is one of the major problems studied by the RL community, and different approaches exploit structure in the problem domain in different ways. Model learning can make use of the rich transition structure present in sequences of observations, but this approach is usually not sensitive to the reward function. In contrast, model-free methods directly leverage the quantity of interest from the future but have to compose with a potentially weak scalar signal (an estimate of the return). In this paper we develop an approach for representation learning in RL that sits in between these two extremes: we propose to learn what to model in a way that can directly help value prediction. To this end we determine which features of the future trajectory provide useful information to predict the associated return. This provides us with tractable prediction targets that are directly relevant for a task, and can thus accelerate learning of the value function. The idea can be understood as reasoning, in hindsight, about which aspects of the future observations could help past value prediction. We show how this can help dramatically even in simple policy evaluation settings. We then test our approach at scale in challenging domains, including on 57 Atari 2600 games.

openreview.net

On Variational Learning of Controllable Representations for Text without Supervision

Peng Xu

Jackie Cheung

Yanshuai Cao

The variational autoencoder (VAE) can learn the manifold of natural images on certain datasets, as evidenced by meaningful interpolating or extrapolating in the continuous latent space. However, on discrete data such as text, it is unclear if unsupervised learning can discover a similar latent space that allows controllable manipulation. In this work, we find that sequence VAEs trained on text fail to properly decode when the latent codes are manipulated, because the modified codes often land in holes or vacant regions in the aggregated posterior latent space, where the decoding network fails to generalize. Both as a validation of the explanation and as a fix to the problem, we propose to constrain the posterior mean to a learned probability simplex, and perform manipulation within this simplex. Our proposed method mitigates the latent vacancy problem and achieves the first success in unsupervised learning of controllable representations for text. Empirically, our method outperforms unsupervised baselines and strong supervised approaches on text style transfer, and is capable of performing more flexible fine-grained control over text generation than existing methods.
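
One way to picture the simplex constraint is shown below. This is an illustrative NumPy sketch only: the abstract does not give the parameterization, so writing the mean as softmax weights over a set of learned vertices is an assumption, and `softmax`, `simplex_mean`, `logits` and `vertices` are hypothetical names.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def simplex_mean(logits, vertices):
    """Posterior mean as a convex combination of K vertices.

    logits:   (batch, K) unconstrained encoder outputs (assumed)
    vertices: (K, D) vectors spanning the simplex (assumed learned)
    """
    weights = softmax(logits)  # non-negative, each row sums to 1
    return weights @ vertices  # (batch, D), always inside the simplex
```

Because the weights are non-negative and sum to one, any manipulation of `logits` keeps the resulting mean inside the convex hull of `vertices`, which is the property the abstract relies on to avoid vacant latent regions.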

2020-01-01

ICML (published)

proceedings.mlr.press

openreview.net

You could have said that instead: Improving Chatbots with Natural Language Feedback

Makesh Narsimhan Sreedhar

Kun Ni

Siva Reddy

The ubiquitous nature of dialogue systems and their interaction with users generate an enormous amount of data. Can we improve chatbots using this data? A self-feeding chatbot improves itself by asking for natural language feedback when a user is dissatisfied with its response and uses this feedback as an additional training sample. However, user feedback in most cases contains extraneous sequences hindering their usefulness as a training sample. In this work, we propose a generative adversarial model that converts noisy feedback into a plausible natural response in a conversation. The generator’s goal is to convert the feedback into a response that answers the user’s previous utterance and to fool the discriminator which distinguishes feedback from natural responses. We show that augmenting original training data with these modified feedback responses improves the original chatbot performance from 69.94% to 75.96% in ranking correct responses on the PERSONACHAT dataset, a large improvement given that the original model is already trained on 131k samples.

2020-01-01

Conference on Empirical Methods in Natural Language Processing (published)

doi.org

arxiv.org

Your GAN is Secretly an Energy-based Model and You Should use Discriminator Driven Latent Sampling

Tong Che

Ruixiang Zhang

Jascha Sohl-Dickstein

Hugo Larochelle

Liam Paull

Yuan Cao

Yoshua Bengio

We show that the sum of the implicit generator log-density …

arxiv.org

Learning from Learning Machines: Optimisation, Rules, and Social Norms

Travis LaCroix

Yoshua Bengio

There is an analogy between machine learning systems and economic entities in that they are both adaptive, and their behaviour is specified in a more-or-less explicit way. It appears that the area of AI that is most analogous to the behaviour of economic entities is that of morally good decision-making, but it is an open question as to how precisely moral behaviour can be achieved in an AI system. This paper explores the analogy between these two complex systems, and we suggest that a clearer understanding of this apparent analogy may help us forward in both the socio-economic domain and the AI domain: known results in economics may help inform feasible solutions in AI safety, but also known results in AI may inform economic policy. If this claim is correct, then the recent successes of deep learning for AI suggest that more implicit specifications work better than explicit ones for solving such problems.

2019-12-29

ArXiv (preprint)

arxiv.org

CLOSURE: Assessing Systematic Generalization of CLEVR Models

Dzmitry Bahdanau

Harm de Vries

Timothy O'Donnell

Shikhar Murty

Philippe Beaudoin

Yoshua Bengio

Aaron Courville

2019-12-12

ArXiv (preprint)

arxiv.org

Interactive Psychometrics for Autism with the Human Dynamic Clamp: Interpersonal Synchrony from Sensory-motor to Socio-cognitive Domains

Florence Baillin

Aline Lefebvre

Amandine Pedoux

Yann Beauxis

Denis-Alexander Engemann

Anna Maruani

Frederique Amsellem

Thomas Bourgeron

Richard Delorme

Guillaume Dumas

2019-12-06

(published)

doi.org

Neuropsychiatric mutations delineate functional brain connectivity dimensions contributing to autism and schizophrenia

Clara A. Moreau

Sebastian Urchs

Pierre Orban

Catherine Schramm

Guillaume Dumas

Aurélie Labbe

Guillaume Huguet

Elise Douard

Pierre-Olivier Quirion

Amy Lin

Leila Kushan

Stephanie Grot

David Luck

Adrianna Mendrek

Stephane Potvin

Emmanuel Stip

Thomas Bourgeron

Alan C. Evans

Carrie E. Bearden

Lune Bellec

Sébastien Jacquemont

16p11.2 and 22q11.2 Copy Number Variants (CNVs) confer high risk for Autism Spectrum Disorder (ASD), schizophrenia (SZ), and Attention-Deficit-Hyperactivity-Disorder (ADHD), but their impact on functional connectivity (FC) remains unclear. We analyzed resting-state functional magnetic resonance imaging data from 101 CNV carriers, 755 individuals with idiopathic ASD, SZ, or ADHD and 1,072 controls. We used CNV FC-signatures to identify dimensions contributing to complex idiopathic conditions. CNVs had large mirror effects on FC at the global and regional level. Thalamus, somatomotor, and posterior insula regions played a critical role in dysconnectivity shared across deletions, duplications, idiopathic ASD, SZ but not ADHD. Individuals with higher similarity to deletion FC-signatures exhibited worse cognitive and behavioral symptoms. Deletion similarities identified at the connectivity level could be related to the redundant associations observed genome-wide between gene expression spatial patterns and FC-signatures. Results may explain why many CNVs affect a similar range of neuropsychiatric symptoms.

2019-12-06

bioRxiv (preprint)

doi.org

Applying Knowledge Transfer for Water Body Segmentation in Peru

Jessenia Gonzalez

Debjani Bhowmick

César Beltrán

Kris Sankaran

Yoshua Bengio

2019-12-02

ArXiv (preprint)

arxiv.org

Detecting GAN generated errors

Xiru Zhu

Fengdi Che

Tianzi Yang

Tzuyang Yu

David Meger

Gregory Dudek

Despite an impressive performance from the latest GAN for generating hyper-realistic images, GAN discriminators have difficulty evaluating the quality of an individual generated sample. This is because the task of evaluating the quality of a generated image differs from deciding if an image is real or fake. A generated image could be perfect except in a single area but still be detected as fake. Instead, we propose a novel approach for detecting where errors occur within a generated image. By collaging real images with generated images, we compute for each pixel, whether it belongs to the real distribution or generated distribution. Furthermore, we leverage attention to model long-range dependency; this allows detection of errors which are reasonable locally but not holistically. For evaluation, we show that our error detection can act as a quality metric for an individual image, unlike FID and IS. We leverage Improved Wasserstein, BigGAN, and StyleGAN to show a ranking based on our metric correlates impressively with FID scores. Our work opens the door for better understanding of GAN and the ability to select the best samples from a GAN model.
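
The collaging step can be illustrated with a small example. This is a hedged sketch, not the paper's procedure: the abstract does not say how collages are built, so pasting a single rectangular patch is an assumption, and `make_collage` and its parameters are hypothetical names.

```python
import numpy as np


def make_collage(real, fake, top, left, height, width):
    """Paste a rectangular patch of `fake` into `real`.

    Returns the collage and a per-pixel label mask (1 = generated, 0 = real),
    which could serve as the training target for a per-pixel detector.
    """
    collage = real.copy()
    mask = np.zeros(real.shape[:2], dtype=np.uint8)
    rows = slice(top, top + height)
    cols = slice(left, left + width)
    collage[rows, cols] = fake[rows, cols]
    mask[rows, cols] = 1
    return collage, mask
```

A detector trained on such pairs predicts the mask from the collage, so at test time its per-pixel output on a fully generated image indicates where errors are likely to be.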

2019-12-02

ArXiv (preprint)

arxiv.org
