Publications

Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation

Iulian V. Serban

Tim Klinger

Gerald Tesauro

Kartik Talamadupula

Bowen Zhou

We introduce a new class of models called multiresolution recurrent neural networks, which explicitly model natural language generation at multiple levels of abstraction. The models extend the sequence-to-sequence framework to generate two parallel stochastic processes: a sequence of high-level coarse tokens, and a sequence of natural language words (e.g. sentences). The coarse sequences follow a latent stochastic process with a factorial representation, which helps the models generalize to new examples. The coarse sequences can also incorporate task-specific knowledge, when available. In our experiments, the coarse sequences are extracted using automatic procedures, which are designed to capture compositional structure and semantics. These procedures enable training the multiresolution recurrent neural networks by maximizing the exact joint log-likelihood over both sequences. We apply the models to dialogue response generation in the technical support domain and compare them with several competing models. The multiresolution recurrent neural networks outperform competing models by a substantial margin, achieving state-of-the-art results according to both a human evaluation study and automatic evaluation metrics. Furthermore, experiments show the proposed models generate more fluent, relevant and goal-oriented responses.

2017-02-12

Proceedings of the AAAI Conference on Artificial Intelligence (published)

doi.org

arxiv.org

Movie Description

Anna Rohrbach

Atousa Torabi

Marcus Rohrbach

Niket Tandon

Chris Pal

Hugo Larochelle

Aaron Courville

Bernt Schiele

2017-01-25

International Journal of Computer Vision (published)

doi.org

arxiv.org

Training End-to-End Dialogue Systems with the Ubuntu Dialogue Corpus

Ryan Thomas Lowe

Nissan Pow

Iulian V. Serban

Laurent Charlin

Chia-Wei Liu

Joelle Pineau

In this paper, we construct and train end-to-end neural network-based dialogue systems using an updated version of the recent Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This dataset is interesting because of its size, long context lengths, and technical nature; thus, it can be used to train large models directly from data with minimal feature engineering, which can be both time consuming and expensive. We provide baselines in two different environments: one where models are trained to maximize the log-likelihood of a generated utterance conditioned on the context of the conversation, and one where models are trained to select the correct next response from a list of candidate responses. These are both evaluated on a recall task that we call Next Utterance Classification (NUC), as well as other generation-specific metrics. Finally, we provide a qualitative error analysis to help determine the most promising directions for future research on the Ubuntu Dialogue Corpus, and for end-to-end dialogue systems in general.

2017-01-20

Dialogue & Discourse (published)

doi.org

An Actor-Critic Algorithm for Sequence Prediction

Dzmitry Bahdanau

Philemon Brakel

Kelvin Xu

Anirudh Goyal

Ryan Lowe

Joelle Pineau

Aaron Courville

Yoshua Bengio

We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL). Current log-likelihood training methods are limited by the discrepancy between their training and testing modes, as models must generate tokens conditioned on their previous guesses rather than the ground-truth tokens. We address this problem by introducing a critic network that is trained to predict the value of an output token, given the policy of an actor network. This results in a training procedure that is much closer to the test phase, and allows us to directly optimize for a task-specific score such as BLEU. Crucially, since we leverage these techniques in the supervised learning setting rather than the traditional RL setting, we condition the critic network on the ground-truth output. We show that our method leads to improved performance on both a synthetic task, and for German-English machine translation. Our analysis paves the way for such methods to be applied in natural language generation tasks, such as machine translation, caption generation, and dialogue modelling.

2017-01-01

ICLR.cc/2017/conference (poster)

openreview.net

Adversarially Learned Inference

Vincent Dumoulin

Ishmael Belghazi

Ben Poole

Alex Lamb

Martin Arjovsky

Olivier Mastropietro

Aaron Courville

We introduce the adversarially learned inference (ALI) model, which jointly learns a generation network and an inference network using an adversarial process. The generation network maps samples from stochastic latent variables to the data space while the inference network maps training examples in data space to the space of latent variables. An adversarial game is cast between these two networks and a discriminative network is trained to distinguish between joint latent/data-space samples from the generative network and joint samples from the inference network. We illustrate the ability of the model to learn mutually coherent inference and generation networks through the inspections of model samples and reconstructions and confirm the usefulness of the learned representations by obtaining a performance competitive with state-of-the-art on the semi-supervised SVHN and CIFAR10 tasks.

2017-01-01

ICLR.cc/2017/conference (poster)

openreview.net

BOUNDS LEAD TO IMPROVED CLASSIFIERS

Nicolas Le Roux

The standard approach to supervised classification involves the minimization of a log-loss as an upper bound to the classification error. While this is a tight bound early on in the optimization, it overemphasizes the influence of incorrectly classified examples far from the decision boundary. Updating the upper bound during the optimization leads to improved classification rates while transforming the learning into a sequence of minimization problems. In addition, in the context where the classifier is part of a larger system, this modification makes it possible to link the performance of the classifier to that of the whole system, allowing the seamless introduction of external constraints.

Brain tumor segmentation with Deep Neural Networks

Mohammad Havaei

Axel Davy

David Warde-Farley

Antoine Biard

Pierre-Marc Jodoin

2017-01-01

Medical Image Analysis (published)

doi.org

arxiv.org

Calibrating Energy-based Generative Adversarial Networks

Zihang Dai

Amjad Almahairi

Philip Bachman

Eduard Hovy

Aaron Courville

In this paper, we propose to equip Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specifically, we propose a flexible adversarial training framework, and prove this framework not only ensures the generator converges to the true data distribution, but also enables the discriminator to retain the density information at the global optimal. We derive the analytic form of the induced solution, and analyze the properties. In order to make the proposed framework trainable in practice, we introduce two effective approximation techniques. Empirically, the experiment results closely match our theoretical analysis, verifying the discriminator is able to recover the energy of data distribution.

2017-01-01

ICLR.cc/2017/conference (poster)

openreview.net

Computer Assisted and Robotic Endoscopy and Clinical Image-Based Procedures

M. Cardoso

Tal Arbel

Xiongbiao Luo

Stefan Wesarg

Tobias Reichl

M. Ballester

Jonathan Mcleod

Klaus Dr. Drechsler

T. Peters

Marius Erdt

Kensaku Mori

M. Linguraru

Andreas Uhl

Cristina Oyarzun Laura

R. Shekhar

2017-01-01

Lecture Notes in Computer Science (published)

doi.org

Computer Assisted and Robotic Endoscopy and Clinical Image-Based Procedures

M. Jorge Cardoso

Tal Arbel

Xiongbiao Luo

Stefan Wesarg

Tobias Reichl

M. Ballester

Jonathan Mcleod

Klaus Dr. Drechsler

T. Peters

Marius Erdt

Kensaku Mori

M. Linguraru

Andreas Uhl

Cristina Oyarzun Laura

R. Shekhar

2017-01-01

Lecture Notes in Computer Science (published)

doi.org

Computer-Assisted Conceptual Analysis of Textual Data as Applied to Philosophical Corpuses

Jean Guy Meunier

L. Chartrand

Jackie Cheung

Mathieu Valette

Marie-Noëlle Bayle

2017-01-01

DH (published)

dblp.uni-trier.de

Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support

M. Jorge Cardoso

Tal Arbel

G. Carneiro

T. Syeda-Mahmood

J. Tavares

Mehdi Moradi

Andrew P. Bradley

Hayit Greenspan

J. Papa

Anant Madabhushi

Jacinto C Nascimento

Jaime S. Cardoso

Vasileios Belagiannis

Zhi Lu

Faculdade Engenharia

2017-01-01

Lecture Notes in Computer Science (published)

doi.org

arxiv.org
