Publications

A Multisensor Multi-Bernoulli Filter

Augustin-Alexandru Saucan

Mark J. Coates

Michael G. Rabbat

In this paper, we derive a multisensor multi-Bernoulli (MS-MeMBer) filter for multitarget tracking. Measurements from multiple sensors are employed by the proposed filter to update a set of tracks modeled as a multi-Bernoulli random finite set. An exact implementation of the MS-MeMBer update procedure is computationally intractable. We propose an efficient approximate implementation by using a greedy measurement partitioning mechanism. The proposed filter allows for Gaussian mixture or particle filter implementations. Numerical simulations conducted for both linear-Gaussian and nonlinear models highlight the improved accuracy of the MS-MeMBer filter and its reduced computational load with respect to the multisensor cardinalized probability hypothesis density filter and the iterated-corrector cardinality-balanced multi-Bernoulli filter, especially for low probabilities of detection.

2017-10-14

IEEE Transactions on Signal Processing (published)

doi.org

arxiv.org

Bayesian Hypernetworks

David M. Krueger

Chin-wei Huang

Riashat Islam

Ryan Turner

Alexandre Lacoste

Aaron Courville

We propose Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork, h, is a neural network which learns to transform a simple noise distribution, p(e) = N(0,I), to a distribution q(t) := q(h(e)) over the parameters t of another neural network (the "primary network"). We train q with variational inference, using an invertible h to enable efficient estimation of the variational lower bound on the posterior p(t | D) via sampling. In contrast to most methods for Bayesian deep learning, Bayesian hypernets can represent a complex multimodal approximate posterior with correlations between parameters, while enabling cheap iid sampling of q(t). In practice, Bayesian hypernets provide a better defense against adversarial examples than dropout, and also exhibit competitive performance on a suite of tasks which evaluate model uncertainty, including regularization, active learning, and anomaly detection.

2017-10-12

ArXiv (preprint)

arxiv.org

Learning Independent Features with Adversarial Nets for Non-linear ICA

Philemon Brakel

Yoshua Bengio

Reliable measures of statistical dependence could potentially be useful tools for learning independent features and performing tasks like source separation using Independent Component Analysis (ICA). Unfortunately, many such measures, like the mutual information, are hard to estimate and optimize directly. We propose to learn independent features with adversarial objectives (Goodfellow et al. 2014, Arjovsky et al. 2017) which optimize such measures implicitly. These objectives compare samples from the joint distribution and the product of the marginals without the need to compute any probability densities. We also propose two methods for obtaining samples from the product of the marginals using either a simple resampling trick or a separate parametric distribution. Our experiments show that this strategy can easily be applied to different types of model architectures and solve both linear and non-linear ICA problems.

2017-10-12

ArXiv (preprint)

openreview.net
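
The "simple resampling trick" mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that samples from the product of the marginals are approximated by permuting each feature column independently across a batch of joint samples; the paper's exact procedure may differ. The function name `resample_product_of_marginals` and the toy data are hypothetical.

```python
import random


def resample_product_of_marginals(batch, rng=None):
    """Return a batch whose columns are independently permuted.

    Assumption (not taken from the paper): shuffling each feature
    across the batch keeps every marginal intact while breaking the
    dependence between features.
    """
    rng = rng or random.Random()
    n_features = len(batch[0])
    columns = []
    for j in range(n_features):
        column = [row[j] for row in batch]
        rng.shuffle(column)  # permute this feature only
        columns.append(column)
    # Reassemble rows from the independently shuffled columns.
    return [tuple(col[i] for col in columns) for i in range(len(batch))]


# Toy joint samples with perfectly dependent features: y = 2 * x.
joint = [(x, 2 * x) for x in range(6)]
shuffled = resample_product_of_marginals(joint, random.Random(0))
```

A discriminator trained to tell `joint` from `shuffled` would then provide the adversarial signal the abstract describes, without any density computation.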

Learnable Explicit Density for Continuous Latent Space and Variational Inference

In this paper, we study two aspects of the variational autoencoder (VAE): the prior distribution over the latent variables and its corresponding posterior. First, we decompose the learning of VAEs into layerwise density estimation, and argue that having a flexible prior is beneficial to both sample generation and inference. Second, we analyze the family of inverse autoregressive flows (inverse AF) and show that with further improvement, inverse AF could be used as universal approximation to any complicated posterior. Our analysis results in a unified approach to parameterizing a VAE, without the need to restrict ourselves to use factorial Gaussians in the latent real space.

2017-10-05

ArXiv (preprint)

arxiv.org

The Consciousness Prior

Yoshua Bengio

A new prior is proposed for learning representations of high-level concepts of the kind we manipulate with language. This prior can be combined with other priors in order to help disentangle abstract factors from each other. It is inspired by cognitive neuroscience theories of consciousness, seen as a bottleneck through which just a few elements, after having been selected by attention from a broader pool, are then broadcast and condition further processing, both in perception and decision-making. The set of recently selected elements one becomes aware of is seen as forming a low-dimensional conscious state. This conscious state combines the few concepts constituting a conscious thought, i.e., what one is immediately conscious of at a particular moment. We claim that this architectural and information-processing constraint corresponds to assumptions about the joint distribution between high-level concepts. To the extent that these assumptions are generally true (and the form of natural language seems consistent with them), they can form a useful prior for representation learning. A low-dimensional thought or conscious state is analogous to a sentence: it involves only a few variables and yet can make a statement with very high probability of being true. This is consistent with a joint distribution (over high-level concepts) which has the form of a sparse factor graph, i.e., where the dependencies captured by each factor of the factor graph involve only very few variables while creating a strong dip in the overall energy function. The consciousness prior also makes it natural to map conscious states to natural language utterances or to express classical AI knowledge in a form similar to facts and rules, albeit capturing uncertainty as well as efficient search mechanisms implemented by attention mechanisms.

2017-09-24

ArXiv (preprint)

arxiv.org

Neural Network Based Nonlinear Weighted Finite Automata

Tianyu Li

Guillaume Rabusseau

Doina Precup

Weighted finite automata (WFA) can expressively model functions defined over strings but are inherently linear models. Given the recent successes of nonlinear models in machine learning, it is natural to wonder whether extending WFA to the nonlinear setting would be beneficial. In this paper, we propose a novel neural network based nonlinear WFA model (NL-WFA) along with a learning algorithm. Our learning algorithm is inspired by the spectral learning algorithm for WFA and relies on a nonlinear decomposition of the so-called Hankel matrix, by means of an auto-encoder network. The expressive power of NL-WFA and the proposed learning algorithm are assessed on both synthetic and real-world data, showing that NL-WFA can lead to smaller model sizes and infer complex grammatical structures from data.

2017-09-12

ArXiv (preprint)

arxiv.org
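
The Hankel matrix that the abstract above refers to can be sketched as follows. For a function f over strings, a finite block of the Hankel matrix has entries H[u][v] = f(uv) for chosen prefixes u and suffixes v. This is only a minimal illustration of that standard definition; the helper names, the toy function `count_a`, and the choice of basis are hypothetical, and the nonlinear auto-encoder decomposition described in the abstract is not shown.

```python
from itertools import product


def strings_up_to(alphabet, max_len):
    """All strings over `alphabet` of length 0..max_len, shortest first."""
    out = []
    for n in range(max_len + 1):
        out.extend("".join(p) for p in product(alphabet, repeat=n))
    return out


def hankel_block(f, prefixes, suffixes):
    """Finite Hankel block: entry [i][j] is f(prefixes[i] + suffixes[j])."""
    return [[f(u + v) for v in suffixes] for u in prefixes]


def count_a(s):
    """Toy string function (illustrative only): number of 'a' symbols."""
    return float(s.count("a"))


basis = strings_up_to("ab", 1)            # empty string, "a", "b"
H = hankel_block(count_a, basis, basis)   # 3 x 3 block
```

In the spectral setting, a low-rank factorization of such a block yields the automaton's parameters; the abstract's method replaces that linear factorization with an auto-encoder.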

A Deep Reinforcement Learning Chatbot

Iulian V. Serban

Mathieu Germain

Michael Pieper

Nan Rosemary Ke

Sai Mudumba

Alexandre De Brébisson

Jose M. R. Sotelo

We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition. MILABOT is capable of conversing with humans on popular small talk topics through both speech and text. The system consists of an ensemble of natural language generation and retrieval models, including template-based models, bag-of-words models, sequence-to-sequence neural network and latent variable neural network models. By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble. The system has been evaluated through A/B testing with real-world users, where it performed significantly better than many competing systems. Due to its machine learning architecture, the system is likely to improve with additional data.

2017-09-06

ArXiv (preprint)

doi.org

arxiv.org

Predicting Future Disease Activity and Treatment Responders for Multiple Sclerosis Patients Using a Bag-of-Lesions Brain Representation

Andrew Doyle

Doina Precup

Douglas Arnold

Tal Arbel

2017-09-03

Medical Image Computing and Computer Assisted Intervention − MICCAI 2017 (published)

doi.org

Horizontal and Vertical Self-Adaptive Cloud Controller with Reward Optimization for Resource Allocation

Jesús Alejandro Cárdenes Cabré

Doina Precup

Ricardo Sanz

Over-booking or under-booking of computing resources leads to higher cost and performance degradation of web applications. To optimize the performance of web applications, access to the resources has to be dynamically controlled ensuring maximum cost-performance ratio of the application while fulfilling requirements. To simplify the design of dynamic cloud controllers, we propose a horizontal and vertical scalability self-aware agent defined by a self-adaptive fuzzy logic with an oriented random optimizer based on reward and memory. The algorithm dynamically adjusts the membership functions and their relationship, maximizing the reward of the system while considering the cost related to the deployment of new resources. The evaluation of the controller under real cloud workload reveals the ability of the algorithm to maximize the performance of the web application based on the target parameters given by an operator.

2017-08-31

International Conference on Cloud and Autonomic Computing (published)

doi.org

On integrating a language model into neural machine translation

2017-08-31

Computer Speech and Language (published)

doi.org

Multi-way, multilingual neural machine translation

Orhan Firat

Kyunghyun Cho

Baskaran Sankaran

F. Yarman-Vural

Yoshua Bengio

2017-08-31

Computer Speech and Language (published)

doi.org

World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions

Teng Long

Emmanuel Bengio

Ryan Lowe

Jackie CK Cheung

Doina Precup

Humans interpret texts with respect to some background information, or world knowledge, and we would like to develop automatic reading comprehension systems that can do the same. In this paper, we introduce a task and several models to drive progress towards this goal. In particular, we propose the task of rare entity prediction: given a web document with several entities removed, models are tasked with predicting the correct missing entities conditioned on the document context and the lexical resources. This task is challenging due to the diversity of language styles and the extremely large number of rare entities. We propose two recurrent neural network architectures which make use of external knowledge in the form of entity descriptions. Our experiments show that our hierarchical LSTM model performs significantly better at the rare entity prediction task than those that do not make use of external resources.

2017-08-31

Conference on Empirical Methods in Natural Language Processing (published)

doi.org
