Aaron Courville

Reza Bayat

PhD - Université de Montréal

Co-supervisor :

Pascal Vincent

Anirudh Buvanesh

PhD - Université de Montréal

Principal supervisor :

Laurent Charlin

Abhranil Chandra

Collaborating researcher - University of Waterloo

Master's Research - Université de Montréal

Juan Duque

PhD - Université de Montréal

PhD - Université de Montréal

Arian Hosseini

PhD - Université de Montréal

Amr Khalifa

PhD - Université de Montréal

Samuel Lavoie

PhD - Université de Montréal

Zhixuan Lin

PhD - Université de Montréal

Ahmed Masry

Collaborating researcher - N/A

Alan Milligan

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

PhD - Université de Montréal

Co-supervisor :

Rishabh Agarwal

Andrei Nicolicioiu

PhD - Université de Montréal

Evgenii Nikishin

Collaborating Alumni - Université de Montréal

Principal supervisor :

PhD - Université de Montréal

Johan Samir Obando Ceron

PhD - Université de Montréal

Co-supervisor :

Collaborating researcher - Université de Montréal

Dereck Piché

Master's Research - Université de Montréal

Khaled Rouissi

Master's Research - Université de Montréal

Esra'a Saleh

PhD - Université de Montréal

Principal supervisor :

Glen Berseth

Vedant Shah

PhD - Université de Montréal

PhD - Université de Montréal

Yusong Wu

PhD - Université de Montréal

Principal supervisor :

Anna (Cheng-Zhi) Huang

Sujin Yun

PhD - Université de Montréal

Xiaofeng Zhang

PhD - Université de Montréal

Dinghuai Zhang

PhD - Université de Montréal

Co-supervisor :

Yoshua Bengio

Hattie Zhou

PhD - Université de Montréal

Principal supervisor :

Hugo Larochelle

Publications

Adversarially Learned Inference

Vincent Dumoulin

Ishmael Belghazi

Ben Poole

We introduce the adversarially learned inference (ALI) model, which jointly learns a generation network and an inference network using an adversarial process. The generation network maps samples from stochastic latent variables to the data space while the inference network maps training examples in data space to the space of latent variables. An adversarial game is cast between these two networks and a discriminative network is trained to distinguish between joint latent/data-space samples from the generative network and joint samples from the inference network. We illustrate the ability of the model to learn mutually coherent inference and generation networks through the inspections of model samples and reconstructions and confirm the usefulness of the learned representations by obtaining a performance competitive with state-of-the-art on the semi-supervised SVHN and CIFAR10 tasks.

2017-01-01

ICLR.cc/2017/conference (poster)

Brain tumor segmentation with Deep Neural Networks

Pierre-Marc Jodoin

2017-01-01

Medical Image Analysis (published)

doi.org

Calibrating Energy-based Generative Adversarial Networks

Zihang Dai

Amjad Almahairi

Philip Bachman

Eduard Hovy

In this paper, we propose to equip Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specifically, we propose a flexible adversarial training framework, and prove that this framework not only ensures the generator converges to the true data distribution, but also enables the discriminator to retain the density information at the global optimum. We derive the analytic form of the induced solution, and analyze its properties. In order to make the proposed framework trainable in practice, we introduce two effective approximation techniques. Empirically, the experimental results closely match our theoretical analysis, verifying that the discriminator is able to recover the energy of the data distribution.

2017-01-01

ICLR.cc/2017/conference (poster)

Facilitating Multimodality in Normalizing Flows

Chin-wei Huang

David Scott Krueger

The true Bayesian posterior of a model such as a neural network may be highly multimodal. In principle, normalizing flows can represent such a distribution via compositions of invertible transformations of random noise. In practice, however, existing normalizing flows may fail to capture most of the modes of a distribution. We argue that the conditionally affine structure of the transformations used in [Dinh et al., 2014, 2016, Kingma et al., 2016] is inefficient, and show that flows which instead use (conditional) invertible non-linear transformations naturally enable multimodality in their output distributions. With just two layers of our proposed deep sigmoidal flow, we are able to model complicated 2d energy functions with much higher fidelity than six layers of deep affine flows.

GibbsNet: Iterative Adversarial Inference for Deep Graphical Models

Directed latent variable models that formulate the joint distribution as …

Improved Training of Wasserstein GANs

Generative Adversarial Networks (GANs) are powerful generative models, but suffer from training instability. The recently proposed Wasserstein GAN (WGAN) makes progress toward stable training of GANs, but sometimes can still generate only low-quality samples or fail to converge. We find that these problems are often due to the use of weight clipping in WGAN to enforce a Lipschitz constraint on the critic, which can lead to undesired behavior. We propose an alternative to clipping weights: penalize the norm of gradient of the critic with respect to its input. Our proposed method performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning, including 101-layer ResNets and language models over discrete data. We also achieve high quality generations on CIFAR-10 and LSUN bedrooms.
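As a rough illustration of the idea in this abstract, the sketch below computes a gradient-norm penalty for a toy critic whose input gradient can be written by hand. The abstract only says that the norm of the critic's gradient with respect to its input is penalized; the two-sided form `(norm - 1) ** 2`, the evaluation at random interpolates between real and generated points, the coefficient `lam=10.0`, and the toy critic itself are assumptions made for this example.

```python
import math
import random

def critic(x, W, v):
    """Toy critic f(x) = sum_j v[j] * tanh(W[j] . x) for one input vector x."""
    return sum(vj * math.tanh(sum(wi * xi for wi, xi in zip(wj, x)))
               for wj, vj in zip(W, v))

def critic_input_grad(x, W, v):
    """Analytic gradient of the toy critic with respect to its input x."""
    grad = [0.0] * len(x)
    for wj, vj in zip(W, v):
        h = math.tanh(sum(wi * xi for wi, xi in zip(wj, x)))
        for i, wi in enumerate(wj):
            grad[i] += vj * (1.0 - h * h) * wi
    return grad

def gradient_penalty(real, fake, W, v, rng, lam=10.0):
    """Mean squared deviation of the input-gradient norm from 1 (assumed form),
    evaluated at random interpolates between real and generated points."""
    total = 0.0
    for xr, xf in zip(real, fake):
        eps = rng.random()
        x_hat = [eps * a + (1.0 - eps) * b for a, b in zip(xr, xf)]
        norm = math.sqrt(sum(g * g for g in critic_input_grad(x_hat, W, v)))
        total += (norm - 1.0) ** 2
    return lam * total / len(real)

# Hypothetical toy data: random critic weights and random "real"/"fake" batches.
rng = random.Random(0)
dim, hidden, batch = 4, 8, 16
W = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(hidden)]
v = [rng.gauss(0, 1) for _ in range(hidden)]
real = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(batch)]
fake = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(batch)]

penalty = gradient_penalty(real, fake, W, v, rng)
# Critic objective: Wasserstein-style difference of means plus the penalty.
critic_loss = (sum(critic(x, W, v) for x in fake) / batch
               - sum(critic(x, W, v) for x in real) / batch
               + penalty)
```

In a real training loop the gradient would come from automatic differentiation rather than a hand-written formula; the analytic gradient here only keeps the example dependency-free.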

Modulating early visual processing by language

Jérémie Mary

Olivier Pietquin

It is commonly assumed that language refers to high-level visual concepts while leaving low-level visual processing unaffected. This view dominates the current literature in computational models for language-vision tasks, where visual and linguistic input are mostly processed independently before being fused into a single representation. In this paper, we deviate from this classic pipeline and propose to modulate the entire visual processing by linguistic input. Specifically, we condition the batch normalization parameters of a pretrained residual network (ResNet) on a language embedding. This approach, which we call MOdulated RESnet (MODERN), significantly improves strong baselines on two visual question answering tasks. Our ablation study shows that modulating from the early stages of the visual processing is beneficial.
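The conditioning step described in this abstract can be sketched as a batch normalization layer whose scale and shift depend on a per-example language embedding. This is a minimal sketch under stated assumptions: the abstract does not say how the parameters are predicted, so the additive offsets to a base `gamma` and `beta`, the linear maps `W_gamma` and `W_beta`, and the flat `features[batch][channel]` layout (no spatial dimensions) are all hypothetical choices for illustration.

```python
import math

def linear(vec, weights):
    """Multiply a vector by a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def conditional_batch_norm(features, embeddings, gamma, beta,
                           W_gamma, W_beta, eps=1e-5):
    """Batch-normalize features[batch][channel], then scale and shift each
    example with parameters offset by a function of its language embedding."""
    n, channels = len(features), len(gamma)
    # Assumed parameterization: per-example offsets predicted by linear maps.
    d_gamma = [linear(e, W_gamma) for e in embeddings]
    d_beta = [linear(e, W_beta) for e in embeddings]
    out = [[0.0] * channels for _ in range(n)]
    for c in range(channels):
        col = [row[c] for row in features]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        for b in range(n):
            x_norm = (col[b] - mean) / math.sqrt(var + eps)
            out[b][c] = (gamma[c] + d_gamma[b][c]) * x_norm + (beta[c] + d_beta[b][c])
    return out

# Hypothetical toy inputs: two examples, two channels, two-dimensional embeddings.
features = [[1.0, 2.0], [3.0, 6.0]]
embeddings = [[1.0, 0.0], [0.0, 1.0]]
modulated = conditional_batch_norm(
    features, embeddings,
    gamma=[1.0, 1.0], beta=[0.0, 0.0],
    W_gamma=[[0.5, 0.0], [0.0, 0.0]],  # channel 0 scale grows with embedding[0]
    W_beta=[[0.0, 1.0], [0.0, 0.0]],   # channel 0 shift grows with embedding[1]
)
```

With all-zero `W_gamma` and `W_beta` the function reduces to ordinary batch normalization, which is one reason offset-style conditioning is a convenient way to start from pretrained parameters.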

Piecewise Latent Variables for Neural Variational Text Processing

Iulian V. Serban

Alexander G. Ororbia II

Joelle Pineau

Advances in neural variational inference have facilitated the learning of powerful directed graphical models with continuous latent variables, such as variational autoencoders. The hope is that such models will learn to represent rich, multi-modal latent factors in real-world data, such as natural language text. However, current models often assume simplistic priors on the latent variables - such as the uni-modal Gaussian distribution - which are incapable of representing complex latent factors efficiently. To overcome this restriction, we propose the simple, but highly flexible, piecewise constant distribution. This distribution has the capacity to represent an exponential number of modes of a latent target distribution, while remaining mathematically tractable. Our results demonstrate that incorporating this new latent distribution into different models yields substantial improvements in natural language processing tasks such as document modeling and natural language generation for dialogue.

2017-01-01

Conference on Empirical Methods in Natural Language Processing (published)

doi.org
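To make the piecewise constant distribution from the abstract above more concrete, here is a minimal one-dimensional sketch. It assumes a density on the unit interval split into equal-width pieces with positive unnormalized weights; the interval, the equal widths, and the sampling scheme (pick a piece, then draw uniformly inside it) are assumptions for illustration and are not taken from the abstract.

```python
import random

def piecewise_density(z, weights):
    """Density at z in [0, 1) of a piecewise constant distribution with
    len(weights) equal-width pieces and unnormalized positive weights."""
    n = len(weights)
    i = min(int(z * n), n - 1)
    return n * weights[i] / sum(weights)

def sample_piecewise(weights, rng):
    """Pick a piece with probability proportional to its weight,
    then draw uniformly inside that piece."""
    n = len(weights)
    u = rng.random() * sum(weights)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if u < acc:
            break
    return (i + rng.random()) / n

# Hypothetical weights giving two separated modes (pieces 1 and 3).
weights = [1.0, 4.0, 1.0, 2.0]
rng = random.Random(0)
samples = [sample_piecewise(weights, rng) for _ in range(20000)]
```

Because the density is flat within each piece, normalization and likelihood evaluation need only a sum over the weights, which is the sense in which such a distribution stays tractable while allowing several modes per latent dimension.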

PixelVAE: A Latent Variable Model for Natural Images

Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and model global structure well but have difficulty capturing small details. PixelCNN models details very well, but lacks a latent code and is difficult to scale for capturing large structures. We present PixelVAE, a VAE model with an autoregressive decoder based on PixelCNN. Our model requires very few expensive autoregressive layers compared to PixelCNN and learns latent codes that are more compressed than a standard VAE while still capturing most non-trivial structure. Finally, we extend our model to a hierarchy of latent variables at different scales. Our model achieves state-of-the-art performance on binarized MNIST, competitive performance on 64 × 64 ImageNet, and high-quality samples on the LSUN bedrooms dataset.

2017-01-01

ICLR.cc/2017/conference (poster)

Recurrent Batch Normalization

We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks. Whereas previous works only apply batch normalization to the input-to-hidden transformation of RNNs, we demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition, thereby reducing internal covariate shift between time steps. We evaluate our proposal on various sequential problems such as sequence classification, language modeling and question answering. Our empirical results show that our batch-normalized LSTM consistently leads to faster convergence and improved generalization.

2017-01-01

ICLR.cc/2017/conference (poster)
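The Recurrent Batch Normalization entry above describes normalizing the hidden-to-hidden transition as well as the input-to-hidden one. The sketch below shows one possible single-step LSTM update in that spirit. The gate ordering, the separate normalization of the two projections before they are summed, the normalization of the cell state before the output nonlinearity, the use of batch statistics only, and the scale `gamma=0.1` are assumptions made for this example rather than details given in the abstract.

```python
import math
import random

def batch_norm(rows, gamma, beta=0.0, eps=1e-5):
    """Normalize each column of rows[batch][unit] over the batch, then scale and shift."""
    n = len(rows)
    out = [[0.0] * len(rows[0]) for _ in range(n)]
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        for b in range(n):
            out[b][j] = gamma * (col[b] - mean) / math.sqrt(var + eps) + beta
    return out

def matmul(rows, W):
    """Multiply rows[batch][in] by W[in][out]."""
    return [[sum(r[k] * W[k][j] for k in range(len(W))) for j in range(len(W[0]))]
            for r in rows]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bn_lstm_step(x, h, c, W_x, W_h, bias, gamma=0.1):
    """One LSTM step with batch-normalized input-to-hidden and hidden-to-hidden terms."""
    hidden = len(h[0])
    pre_x = batch_norm(matmul(x, W_x), gamma)   # input-to-hidden, normalized
    pre_h = batch_norm(matmul(h, W_h), gamma)   # hidden-to-hidden, normalized
    new_c, out_gates = [], []
    for b in range(len(x)):
        z = [px + ph + bj for px, ph, bj in zip(pre_x[b], pre_h[b], bias)]
        # Assumed gate layout: forget, input, output, candidate.
        f, i, o, g = (z[k * hidden:(k + 1) * hidden] for k in range(4))
        new_c.append([sigmoid(fj) * cj + sigmoid(ij) * math.tanh(gj)
                      for fj, ij, gj, cj in zip(f, i, g, c[b])])
        out_gates.append([sigmoid(oj) for oj in o])
    c_norm = batch_norm(new_c, gamma)           # normalize the cell before the output
    new_h = [[oj * math.tanh(cj) for oj, cj in zip(og, cn)]
             for og, cn in zip(out_gates, c_norm)]
    return new_h, new_c

# Hypothetical toy dimensions and random weights.
rng = random.Random(0)
batch, n_in, hidden = 4, 3, 2
x = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(batch)]
h0 = [[0.0] * hidden for _ in range(batch)]
c0 = [[0.0] * hidden for _ in range(batch)]
W_x = [[rng.gauss(0, 1) for _ in range(4 * hidden)] for _ in range(n_in)]
W_h = [[rng.gauss(0, 1) for _ in range(4 * hidden)] for _ in range(hidden)]
bias = [0.0] * (4 * hidden)
h1, c1 = bn_lstm_step(x, h0, c0, W_x, W_h, bias)
```

A practical implementation would also track running statistics for use at test time, which this single-step sketch leaves out.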

SampleRNN: An Unconditional End-to-End Neural Audio Generation Model

Jose Sotelo

In this paper we propose a novel model for the unconditional audio generation task that generates one audio sample at a time. We show that our model, which profits from combining memory-less modules, namely autoregressive multilayer perceptrons, and stateful recurrent neural networks in a hierarchical structure, is able to capture the underlying sources of variation in the temporal domain over very long time spans on three datasets of different nature. Human evaluation of the generated samples indicates that our model is preferred over competing models. We also show how each component of the model contributes to the exhibited performance.

2017-01-01

ICLR.cc/2017/conference (poster)

Sequentialized Sampling Importance Resampling and Scalable IWAE

Chin-wei Huang