Publications

Theano: A Python framework for fast computation of mathematical expressions
Rami Al-Rfou
Amjad Almahairi
Christof Angermüller
Nicolas Ballas
Frédéric Bastien
Justin S. Bayer
A. Belikov
A. Belopolsky
J. Bergstra
Valentin Bisson
Josh Bleecher Snyder
Nicolas Bouchard
Nicolas Boulanger-Lewandowski
Alexandre De Brébisson
Kyunghyun Cho
Jan Chorowski
Paul F. Christiano
Tim Cooijmans
Marc-Alexandre Côté
Myriam Côté
Yann Dauphin
Olivier Delalleau
Julien Demouth
Guillaume Desjardins
Sander Dieleman
Laurent Dinh
Mélanie Ducoffe
Vincent Dumoulin
Dumitru Erhan
Ziye Fan
Orhan Firat
Mathieu Germain
Xavier Glorot
Ian J. Goodfellow
Matthew Graham
Caglar Gulcehre
Philippe Hamel
Iban Harlouchet
Jean-Philippe Heng
Balázs Hidasi
Sina Honari
Arjun Jain
Sébastien Jean
Kai Jia
Mikhail V. Korobov
Vivek Kulkarni
Alex Lamb
Pascal Lamblin
Eric P. Larsen
César Laurent
S. Lee
Simon-Mark Lefrancois
Simon Lemieux
Nicholas Léonard
Zhouhan Lin
J. Livezey
Cory R. Lorenz
Jeremiah L. Lowin
Qianli M. Ma
Pierre-Antoine Manzagol
Olivier Mastropietro
R. McGibbon
Roland Memisevic
Bart van Merriënboer
Mehdi Mirza
Alberto Orlandi
Colin Raffel
Daniel Renshaw
Matthew David Rocklin
Markus Dr. Roth
Peter Sadowski
John Salvatier
François Savard
Jan Schlüter
John D. Schulman
Gabriel Schwartz
Iulian V. Serban
Dmitriy Serdyuk
Samira Shabanian
Etienne Simon
Sigurd Spieckermann
S. Subramanyam
Jakub Sygnowski
Jérémie Tanguay
Gijs van Tulder
Joseph P. Turian
Sebastian Urban
Francesco Visin
Harm de Vries
David Warde-Farley
Dustin J. Webb
M. Willson
Kelvin Xu
Lijun Xue
Li Yao
Saizheng Zhang
Ying Zhang
Theano is a Python library that allows users to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano has been actively and continuously developed since 2008; multiple frameworks have been built on top of it, and it has been used to produce many state-of-the-art machine learning models. The present article is structured as follows. Section I provides an overview of the Theano software and its community. Section II presents the principal features of Theano and how to use them, and compares them with other similar projects. Section III focuses on recently introduced functionalities and improvements. Section IV compares the performance of Theano against Torch7 and TensorFlow on several machine learning models. Section V discusses current limitations of Theano and potential ways of improving it.
A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues
Sequential data often possesses hierarchical structures with complex dependencies between sub-sequences, such as found between the utterances in a dialogue. To model these dependencies in a generative framework, we propose a neural network-based generative architecture, with stochastic latent variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with other recent neural-network architectures. We evaluate the model performance through a human evaluation study. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate both the generation of meaningful, long and diverse responses and maintaining dialogue state.
Nash equilibria in the two-player kidney exchange game
Andrea Lodi
João Pedro Pedroso
Ana Luiza D'ávila Viana
Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Network Models
We investigate the task of building open domain, conversational dialogue systems based on large dialogue corpora using generative models. Generative models produce system responses that are autonomously generated word-by-word, opening up the possibility for realistic, flexible interactions. In support of this goal, we extend the recently proposed hierarchical recurrent encoder-decoder neural network to the dialogue domain, and demonstrate that this model is competitive with state-of-the-art neural language models and back-off n-gram models. We investigate the limitations of this and similar approaches, and show how its performance can be improved by bootstrapping the learning from a larger question-answer pair corpus and from pretrained word embeddings.
Task Loss Estimation for Structured Prediction
D. Serdyuk
Philemon Brakel
Nan Rosemary Ke
Jan Chorowski
Fault-Tolerant Associative Memories Based on $c$-Partite Graphs
François Leduc-Primeau
Vincent Gripon
Associative memories allow the retrieval of previously stored messages given a part of their content. In this paper, we are interested in associative memories based on c-partite graphs that were recently introduced. These memories are almost optimal in terms of the amount of storage they require (efficiency) and allow retrieving messages with low complexity. We propose a generic implementation model for the retrieval algorithm that can be readily mapped to an integrated circuit and study the retrieval performance when hardware components are affected by faults. We show using analytical and simulation results that these associative memories can be made resilient to circuit faults with a minor modification of the retrieval algorithm. In one example, the memory retains 88% of its efficiency when 1% of the storage cells are faulty, or 98% when 0.1% of the binary outputs of the retrieval algorithm are faulty. When considering storage faults, the fault tolerance exhibited by the proposed associative memory can be comparable to using a capacity-achieving error correction code for protecting the stored information.
Former NASA chief unveils $100 million neural chip maker KnuEdge
C. Strasser
Dean Takahashi
Tim Klinger
Gerald Tesauro
Kartik Talamadupula
Bowen Zhou
Medium, Moore Data, Carly Strasser from June 07, 2016. Open access to research articles has been in the news quite a bit lately (see the SciHub controversy, the preprints in biology discussion, and the European Union’s recent announcement). The Data-Driven Discovery team at the Moore Foundation has also been discussing open access, particularly as it relates to the publications generated by our #MooreData researchers. Our grantee population is fairly progressive when it comes to open science, and many of the outputs that they generate are already publicly available (including proposals, software, workflows, and publications). It is therefore easy for us to imagine that they would embrace a policy that mandates open access for research articles that they produce. That said, we are always open to discussions!
Medical Computer Vision and Bayesian and Graphical Models for Biomedical Imaging
Henning Müller
B. Kelm
Weidong (Tom) Cai
M. Jorge Cardoso
Georg Langs
Bjoern Menze
Dimitris N. Metaxas
Albert A. Montillo
William Wells
Shaoting Zhang
Albert C.S. Chung
M. Jenkinson
Annemie Ribbens
Professor Forcing: A New Algorithm for Training Recurrent Networks
Anirudh Goyal
Alex Lamb
Ying Zhang
Saizheng Zhang
The Teacher Forcing algorithm trains recurrent networks by supplying observed sequence values as inputs during training and using the network’s own one-step-ahead predictions to do multi-step sampling. We introduce the Professor Forcing algorithm, which uses adversarial domain adaptation to encourage the dynamics of the recurrent network to be the same when training the network and when sampling from the network over multiple time steps. We apply Professor Forcing to language modeling, vocal synthesis on raw waveforms, handwriting generation, and image generation. Empirically we find that Professor Forcing acts as a regularizer, improving test likelihood on character level Penn Treebank and sequential MNIST. We also find that the model qualitatively improves samples, especially when sampling for a large number of time steps. This is supported by human evaluation of sample quality. Trade-offs between Professor Forcing and Scheduled Sampling are discussed. We produce T-SNEs showing that Professor Forcing successfully makes the dynamics of the network during training and sampling more similar.
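The gap between training-time and sampling-time inputs described in this abstract can be illustrated with a toy sketch. It is not the paper's method and does not implement Professor Forcing: the predictor `step`, its `weight` and `bias` values, and the example sequence are all hypothetical stand-ins for a trained recurrent network.

```python
def step(prev_value, weight=0.9, bias=0.5):
    """Hypothetical one-step-ahead predictor (a fixed linear map, not a trained network)."""
    return weight * prev_value + bias


def teacher_forced_predictions(observed):
    """Predict each next value from the true previous value, as in teacher forcing."""
    return [step(value) for value in observed[:-1]]


def free_running_samples(first_value, num_steps):
    """Predict each next value from the model's own previous prediction."""
    outputs = []
    value = first_value
    for _ in range(num_steps):
        value = step(value)
        outputs.append(value)
    return outputs


observed = [1.0, 2.0, 1.5, 3.0]  # made-up sequence for illustration
forced = teacher_forced_predictions(observed)
free = free_running_samples(observed[0], len(observed) - 1)

# Both modes agree on the first step, then drift apart because their inputs differ.
print(forced)
print(free)
```

The two lists share their first element and diverge afterwards, which is the training/sampling mismatch that the abstract says Professor Forcing is designed to reduce.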
Poisson Group Testing: A Probabilistic Model for Boolean Compressed Sensing
Olgica Milenkovic
We introduce a novel probabilistic group testing framework, termed Poisson group testing, in which the number of defectives follows a right-truncated Poisson distribution. The Poisson model has a number of new applications, including dynamic testing with diminishing relative rates of defectives. We consider both nonadaptive and semi-adaptive identification methods. For nonadaptive methods, we derive a lower bound on the number of tests required to identify the defectives with a probability of error that asymptotically converges to zero; in addition, we propose test matrix constructions for which the number of tests closely matches the lower bound. For semi-adaptive methods, we describe a lower bound on the expected number of tests required to identify the defectives with zero error probability. In addition, we propose a stage-wise reconstruction algorithm for which the expected number of tests is only a constant factor away from the lower bound. The methods rely only on an estimate of the average number of defectives, rather than on the individual probabilities of subjects being defective.
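As a rough illustration of the two ingredients named in this abstract, the sketch below computes a right-truncated Poisson distribution over the number of defectives and evaluates Boolean (OR) outcomes for a fixed nonadaptive test matrix. The function names, the rate, the truncation point, and the small test matrix are assumptions chosen for the example; they are not the paper's constructions or bounds.

```python
import math


def truncated_poisson_pmf(rate, max_defectives):
    """Poisson(rate) probabilities restricted to 0..max_defectives and renormalized."""
    weights = [
        math.exp(-rate) * rate ** k / math.factorial(k)
        for k in range(max_defectives + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def group_test_outcomes(test_matrix, defective):
    """A pooled test is positive when its pool contains at least one defective subject."""
    return [
        any(in_pool and is_defective for in_pool, is_defective in zip(row, defective))
        for row in test_matrix
    ]


pmf = truncated_poisson_pmf(rate=2.0, max_defectives=5)

# Hypothetical nonadaptive design: 3 pooled tests over 4 subjects (1 = subject is in the pool).
test_matrix = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
defective = [False, True, False, False]
outcomes = group_test_outcomes(test_matrix, defective)
print(pmf)
print(outcomes)
```

Here only the average number of defectives (the `rate`) is needed to specify the prior, which matches the abstract's remark that the methods rely on an estimate of that average rather than on per-subject probabilities.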