Publications

Towards more hardware-friendly deep learning

Yoshua Bengio

2017-06-23

TiML '17 (published)

doi.org

Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control

Policy gradient methods in reinforcement learning have become increasingly prevalent for state-of-the-art performance in continuous control tasks. Novel methods typically benchmark against a few key algorithms such as deep deterministic policy gradients and trust region policy optimization. As such, it is important to present and use consistent baseline experiments. However, this can be difficult due to general variance in the algorithms, hyper-parameter tuning, and environment stochasticity. We investigate and discuss: the significance of hyper-parameters in policy gradients for continuous control, general variance in the algorithms, and reproducibility of reported results. We provide guidelines on reporting novel results as comparisons against baseline methods such that future researchers can make informed decisions when investigating novel methods.

2017-06-16

ICML.cc/2017/RML (poster)

openreview.net

Time-Varying Mixtures of Markov Chains: An Application to Road Traffic Modeling

Sean Lawlor

Michael G. Rabbat

Time-varying mixture models are useful for representing complex, dynamic distributions. Components in the mixture model can appear and disappear, and persisting components can evolve. This allows great flexibility in streaming data applications where the model can be adjusted as new data arrives. Fitting a mixture model with computational guarantees which can meet real-time requirements is challenging with existing algorithms, especially when the model order can vary with time. Existing approximate inference methods may require multiple restarts to search for a good local solution. Monte-Carlo methods can be used to jointly estimate the model order and model parameters, but when the distribution of each mixand has a high-dimensional parameter space, they suffer from the curse of dimensionality and from slow convergence. This paper proposes a generative model for time-varying mixture models, tailored for mixtures of discrete-time Markov chains. A novel, deterministic inference procedure is introduced and is shown to be suitable for applications requiring real-time estimation, and the method is guaranteed to converge at each time step. As a motivating application, we model and predict traffic patterns in a transportation network. Experiments illustrate the performance of the scheme and offer insights regarding tuning of the algorithm parameters. The experiments also investigate the predictive power of the proposed model compared to less complex models and demonstrate the superiority of the mixture model approach for prediction of traffic routes in real data.

2017-06-14

IEEE Transactions on Signal Processing (published)

doi.org

Learning to Compute Word Embeddings on the Fly

Dzmitry Bahdanau

Tom Bosc

Stanisław Jastrzębski

Edward Grefenstette

Pascal Vincent

Yoshua Bengio

Words in natural language follow a Zipfian distribution whereby some words are frequent but most are rare. Learning representations for words in the "long tail" of this distribution requires enormous amounts of data. Representations of rare words trained directly on end tasks are usually poor, requiring us to pre-train embeddings on external data, or treat all rare words as out-of-vocabulary words with a unique representation. We provide a method for predicting embeddings of rare words on the fly from small amounts of auxiliary data with a network trained end-to-end for the downstream task. We show that this improves results against baselines where embeddings are trained on the end task for reading comprehension, recognizing textual entailment and language modeling.

2017-05-31

ArXiv (preprint)

doi.org

openreview.net

Deep Learning for Patient-Specific Kidney Graft Survival Analysis

Margaux Luck

Tristan Sylvain

Heloise Cardinal

Andrea Lodi

Yoshua Bengio

An accurate model of patient-specific kidney graft survival distributions can help to improve shared decision-making in the treatment and care of patients. In this paper, we propose a deep learning method that directly models the survival function instead of estimating the hazard function to predict survival times for graft patients based on the principle of multi-task learning. By learning to jointly predict the time of the event and its rank in the Cox partial log-likelihood framework, our deep learning approach outperforms, in terms of survival time prediction quality and concordance index, other common methods for survival analysis, including the Cox Proportional Hazards model and a network trained on the Cox partial log-likelihood.

2017-05-28

ArXiv (preprint)

arxiv.org

Deep Complex Networks

Christopher Pal

2017-05-26

ArXiv (preprint)

openreview.net

Implementation of Sparse Superposition Codes

Carlo Condo

Warren J. Gross

Sparse superposition codes (SSCs) are capacity-achieving codes whose decoding process is a linear sensing problem. Decoding approaches thus exploit the approximate message passing algorithm, which has been proven to be effective in compressed sensing. Previous work from the authors has evaluated the error correction performance of SSCs under finite precision and finite code length. This paper proposes the first SSC encoder and decoder architectures in the literature. The architectures are parametrized and applicable to all SSCs. A set of wide-ranging case studies is then considered, and code-specific approximations, along with implementation results in 65 nm CMOS technology, are then provided. The encoding process can be carried out with low power consumption (≤2.103 mW), while the semi-parallel decoder architecture can reach a throughput of 1.3 Gb/s with a 768 × 6-bit SSC codeword and an area occupation of 2.43 mm².

2017-04-30

IEEE Transactions on Signal Processing (published)

doi.org

Multi-Modal Variational Encoder-Decoders

Iulian V. Serban

Alexander G. Ororbia II

Joelle Pineau

Aaron Courville

2017-04-23

arXiv.org (preprint)

openreview.net

Investigating Recurrence and Eligibility Traces in Deep Q-Networks

Jean Harb

Doina Precup

Eligibility traces in reinforcement learning are used as a bias-variance trade-off and can often speed up training time by propagating knowledge back over time-steps in a single update. We investigate the use of eligibility traces in combination with recurrent networks in the Atari domain. We illustrate the benefits of both recurrent nets and eligibility traces in some Atari games, and also highlight the importance of the optimization used in the training.

2017-04-17

ArXiv (preprint)

openreview.net

RATM: Recurrent Attentive Tracking Model

Samira Ebrahimi Kahou

Vincent Michalski

Roland Memisevic

Christopher Pal

We present an attention-based modular neural framework for computer vision. The framework uses a soft attention mechanism allowing models to be trained with gradient descent. It consists of three modules: a recurrent attention module controlling where to look in an image or video frame, a feature-extraction module providing a representation of what is seen, and an objective module formalizing why the model learns its attentive behavior. The attention module allows the model to focus computation on task-related information in the input. We apply the framework to several object tracking tasks and explore various design choices. We experiment with three data sets: bouncing ball, moving digits, and the real-world KTH data set. The proposed Recurrent Attentive Tracking Model performs well on all three tasks and can generalize to related but previously unseen sequences from a challenging tracking data set.

2017-04-11

cv-foundation.org/CVPR/2017/BNMW (unknown)

doi.org

openreview.net

A Sparse Probabilistic Model of User Preference Data

Matthew J. A. Smith

Laurent Charlin

Joelle Pineau

2017-04-10

Advances in Artificial Intelligence (published)

doi.org

Multi-Timescale, Gradient Descent, Temporal Difference Learning with Linear Options

Peeyush T. Kumar

Doina Precup

Deliberating on large or continuous state spaces has been a long-standing challenge in reinforcement learning. Temporal abstraction has somewhat made this possible, but efficiently planning using temporal abstraction still remains an issue. Moreover, using spatial abstractions to learn policies for various situations at once while using temporal abstraction models is an open problem. We propose here an efficient algorithm which is convergent under linear function approximation while planning using temporally abstract actions. We show how this algorithm can be used along with randomly generated option models over multiple time scales to plan for agents which need to act in real time. Using these randomly generated option models over multiple time scales is shown to reduce the number of decision epochs required to solve the given task, hence effectively reducing the time needed for deliberation.

2017-03-18

ArXiv (preprint)

arxiv.org
