Publications

Towards Text Generation with Adversarially Learned Neural Outlines.

Sandeep Subramanian

Sai Rajeswar

Alessandro Sordoni

Adam Trischler

Aaron C. Courville

Chris Pal

2017-12-31

Advances in Neural Information Processing Systems 31 (NeurIPS 2018) (published)

dblp.uni-trier.de

Trends and Applications in Knowledge Discovery and Data Mining

Lida Rashidi

Benjamin C. M. Fung

Can Wang

2017-12-31

Lecture Notes in Computer Science (published)

doi.org

Twin Networks: Matching the Future for Sequence Generation

Dmitriy Serdyuk

Nan Rosemary Ke

Alessandro Sordoni

Adam Trischler

Christopher Pal

Yoshua Bengio

We propose a simple technique for encouraging generative RNNs to plan ahead. We train a "backward" recurrent network to generate a given sequence in reverse order, and we encourage states of the forward model to predict cotemporal states of the backward model. The backward network is used only during training, and plays no role during sampling or inference. We hypothesize that our approach eases modeling of long-term dependencies by implicitly forcing the forward states to hold information about the longer-term future (as contained in the backward states). We show empirically that our approach achieves 9% relative improvement for a speech recognition task, and achieves significant improvement on a COCO caption generation task.

2017-12-31

ICLR (Poster) (published)

openreview.net

Universal Successor Representations for Transfer Reinforcement Learning

Chen Ma

Junfeng Wen

Yoshua Bengio

The objective of transfer reinforcement learning is to generalize from a set of previous tasks to unseen new tasks. In this work, we focus on the transfer scenario where the dynamics among tasks are the same, but their goals differ. Although general value function (Sutton et al., 2011) has been shown to be useful for knowledge transfer, learning a universal value function can be challenging in practice. To attack this, we propose (1) to use universal successor representations (USR) to represent the transferable knowledge and (2) a USR approximator (USRA) that can be trained by interacting with the environment. Our experiments show that USR can be effectively applied to new tasks, and the agent initialized by the trained USRA can achieve the goal considerably faster than random initialization.

2017-12-31

ICLR (Workshop) (published)

openreview.net

Unsupervised Depth Estimation, 3D Face Rotation and Replacement

Joel Ruben Antony Moniz

Christopher Beckham

Simon Rajotte

Sina Honari

Christopher Pal

We present an unsupervised approach for learning to estimate three dimensional (3D) facial structure from a single image while also predicting 3D viewpoint transformations that match a desired pose and facial geometry. We achieve this by inferring the depth of facial keypoints of an input image in an unsupervised manner, without using any form of ground-truth depth information. We show how it is possible to use these depths as intermediate computations within a new backpropable loss to predict the parameters of a 3D affine transformation matrix that maps inferred 3D keypoints of an input face to the corresponding 2D keypoints on a desired target facial geometry or pose. Our resulting approach, called DepthNets, can therefore be used to infer plausible 3D transformations from one face pose to another, allowing faces to be frontalized, transformed into 3D models or even warped to another pose and facial geometry. Lastly, we identify certain shortcomings with our formulation, and explore adversarial image translation techniques as a post-processing step to re-synthesize complete head shots for faces re-targeted to different poses or identities.

2017-12-31

Advances in Neural Information Processing Systems 31 (NeurIPS 2018) (published)

arxiv.org

Boosting Based Multiple Kernel Learning and Transfer Regression for Electricity Load Forecasting

Di Wu

Boyu Wang

Doina Precup

Benoit Boulet

2017-12-29

Machine Learning and Knowledge Discovery in Databases (published)

doi.org

Dendritic error backpropagation in deep cortical microcircuits

João Sacramento

Rui Ponte Costa

Yoshua Bengio

Walter Senn

Animal behaviour depends on learning to associate sensory stimuli with the desired motor command. Understanding how the brain orchestrates the necessary synaptic modifications across different brain areas has remained a longstanding puzzle. Here, we introduce a multi-area neuronal network model in which synaptic plasticity continuously adapts the network towards a global desired output. In this model synaptic learning is driven by a local dendritic prediction error that arises from a failure to predict the top-down input given the bottom-up activities. Such errors occur at apical dendrites of pyramidal neurons where both long-range excitatory feedback and local inhibitory predictions are integrated. When local inhibition fails to match excitatory feedback an error occurs which triggers plasticity at bottom-up synapses at basal dendrites of the same pyramidal neurons. We demonstrate the learning capabilities of the model in a number of tasks and show that it approximates the classical error backpropagation algorithm. Finally, complementing this cortical circuit with a disinhibitory mechanism enables attention-like stimulus denoising and generation. Our framework makes several experimental predictions on the function of dendritic integration and cortical microcircuits, is consistent with recent observations of cross-area learning, and suggests a biological implementation of deep learning.

2017-12-29

ArXiv (preprint)

arxiv.org

Tensor Regression Networks with various Low-Rank Tensor Approximations

Xingwei Cao

Guillaume Rabusseau

Joelle Pineau

Tensor regression networks achieve high compression rate of neural networks while having slight impact on performances. They do so by imposing low tensor rank structure on the weight matrices of fully connected layers. In recent years, tensor regression networks have been investigated from the perspective of their compressive power, however, the regularization effect of enforcing low-rank tensor structure has not been investigated enough. We study tensor regression networks using various low-rank tensor approximations, aiming to compare the compressive and regularization power of different low-rank constraints. We evaluate the compressive and regularization performances of the proposed model with both deep and shallow convolutional neural networks. The outcome of our experiment suggests the superiority of Global Average Pooling Layer over Tensor Regression Layer when applied to deep convolutional neural network with CIFAR-10 dataset. On the contrary, shallow convolutional neural networks with tensor regression layer and dropout achieved lower test error than both Global Average Pooling and fully-connected layer with dropout function when trained with a small number of samples.

2017-12-26

ArXiv (preprint)

arxiv.org

ObamaNet: Photo-realistic lip-sync from text

Rithesh Kumar

Jose Sotelo

Kundan Kumar

Alexandre De Brébisson

Yoshua Bengio

We present ObamaNet, the first architecture that generates both audio and synchronized photo-realistic lip-sync videos from any new text. Contrary to other published lip-sync approaches, ours is only composed of fully trainable neural modules and does not rely on any traditional computer graphics methods. More precisely, we use three main modules: a text-to-speech network based on Char2Wav, a time-delayed LSTM to generate mouth-keypoints synced to the audio, and a network based on Pix2Pix to generate the video frames conditioned on the keypoints.

2017-12-05

ArXiv (preprint)

arxiv.org

Deep Learning @15 Petaflops/second: Semi-supervised pattern detection for 15 Terabytes of climate data

W. Collins

M. Wehner

M. Prabhat

Thorsten Kurth

Nadathur Satish

Ioannis Mitliagkas

Jie Zhang

Evan Racah

Md. Mostofa Ali Patwary

Narayanan Sundaram

Pradeep Dubey

2017-11-30

(published)

www.semanticscholar.org

Predicting Extubation Readiness in Extreme Preterm Infants based on Patterns of Breathing

Charles C. Onu

Lara J. Kanbar

Wissam Shalish

Karen A. Brown

Guilherme M. Sant'Anna

Robert E. Kearney

Doina Precup

Extremely preterm infants commonly require intubation and invasive mechanical ventilation after birth. While the duration of mechanical ventilation should be minimized in order to avoid complications, extubation failure is associated with increases in morbidities and mortality. As part of a prospective observational study aimed at developing an accurate predictor of extubation readiness, Markov and semi-Markov chain models were applied to gain insight into the respiratory patterns of these infants, with more robust time-series modeling using semi-Markov models. This model revealed interesting similarities and differences between newborns who succeeded extubation and those who failed. The parameters of the model were further applied to predict extubation readiness via generative (joint likelihood) and discriminative (support vector machine) approaches. Results showed that up to 84% of infants who failed extubation could have been accurately identified prior to extubation.

2017-11-30

2017 IEEE Symposium Series on Computational Intelligence (SSCI) (published)

doi.org

arxiv.org

Use machine learning to find energy materials.

Phil De Luna

Jennifer N. Wei

Yoshua Bengio

Alán Aspuru-Guzik

E. Sargent

2017-11-30

Nature (published)

doi.org
