Publications

Almost all neural architecture search methods are evaluated in terms of the performance (i.e. test accuracy) of the model structures that they find. Should it be the only metric for a good autoML approach? To examine aspects beyond performance, we propose a set of criteria aimed at evaluating the core of the autoML problem: the amount of human intervention required to deploy these methods in real-world scenarios. Based on our proposed evaluation checklist, we study the effectiveness of a random search strategy for fully automated multimodal neural architecture search. Compared to traditional methods that rely on manually crafted feature extractors, our method selects each modality from a large search space with minimal human supervision. We show that our proposed random search strategy performs close to the state of the art on the AV-MNIST dataset while meeting the desirable characteristics for a fully automated design process.

2020-03-02

ArXiv (preprint)
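The random search strategy described in the abstract above can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the search space keys (`image_layers`, `audio_layers`, `fusion`), the `toy_evaluate` scoring function, and the `random_search` helper are hypothetical and are not taken from the paper. In practice the evaluation step would train a candidate architecture and return its validation accuracy.

```python
import random


def random_search(search_space, evaluate, n_trials, seed=0):
    """Sample configurations uniformly at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one option per dimension of the search space, independently.
        config = {name: rng.choice(options) for name, options in search_space.items()}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical multimodal search space: one choice per modality plus a fusion rule.
SEARCH_SPACE = {
    "image_layers": [1, 2, 3],
    "audio_layers": [1, 2, 3],
    "fusion": ["concat", "sum"],
}


def toy_evaluate(config):
    """Stand-in for 'train the model and return validation accuracy'."""
    bonus = 1 if config["fusion"] == "concat" else 0
    return config["image_layers"] + config["audio_layers"] + bonus


if __name__ == "__main__":
    config, score = random_search(SEARCH_SPACE, toy_evaluate, n_trials=50)
    print(config, score)
```

The only human-supplied inputs in this sketch are the search space and the trial budget, which is the kind of property the abstract's evaluation checklist is concerned with.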

Tensor Networks for Language Modeling

Jacob Miller

John Anthony Terilla

The tensor network formalism has enjoyed over two decades of success in modeling the behavior of complex quantum-mechanical systems, but has only recently and sporadically been leveraged in machine learning. Here we introduce a uniform matrix product state (u-MPS) model for probabilistic modeling of sequence data. We identify several distinctive features of this recurrent generative model, notably the ability to condition or marginalize sampling on characters at arbitrary locations within a sequence, with no need for approximate sampling methods. Despite the sequential architecture of u-MPS, we show that a recursive evaluation algorithm can be used to parallelize its inference and training, with a string of length n only requiring parallel time

2020-03-02

ArXiv (preprint)

Tensor Networks for Probabilistic Sequence Modeling

Jacob Miller

John Anthony Terilla

Tensor networks are a powerful modeling framework developed for computational many-body physics, which have only recently been applied within machine learning. In this work we utilize a uniform matrix product state (u-MPS) model for probabilistic modeling of sequence data. We first show that u-MPS enable sequence-level parallelism, with length-n sequences able to be evaluated in depth O(log n). We then introduce a novel generative algorithm giving trained u-MPS the ability to efficiently sample from a wide variety of conditional distributions, each one defined by a regular expression. Special cases of this algorithm correspond to autoregressive and fill-in-the-blank sampling, but more complex regular expressions permit the generation of richly structured text in a manner that has no direct analogue in current generative models. Experiments on synthetic text data find u-MPS outperforming LSTM baselines in several sampling tasks, and demonstrate strong generalization in the presence of limited data.

2020-03-02

International Conference on Artificial Intelligence and Statistics (published)

proceedings.mlr.press
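The O(log n)-depth evaluation mentioned in the abstract above can be illustrated with a small NumPy sketch. It assumes a u-MPS is parameterized by a single shared core tensor `core` of shape (vocabulary size, D, D) and two boundary vectors `alpha` and `omega`; these names, the helper functions, and the identity-padding trick are illustrative assumptions, not the authors' implementation. The sketch only computes the unnormalized score of a sequence (the product of one matrix per symbol between the two boundary vectors) and omits the normalization needed to turn that score into a probability.

```python
import numpy as np


def sequential_amplitude(core, alpha, omega, seq):
    """Left-to-right contraction: n dependent steps for a length-n sequence."""
    vec = alpha
    for symbol in seq:
        vec = vec @ core[symbol]
    return float(vec @ omega)


def tree_amplitude(core, alpha, omega, seq):
    """Pairwise contraction: each round halves the number of matrices.

    Returns the score and the number of rounds, which is ceil(log2(n)).
    Each round is one batched matrix product, so it could run in parallel.
    """
    bond_dim = core.shape[1]
    if len(seq) == 0:
        return float(alpha @ omega), 0
    mats = core[np.asarray(seq)]  # shape (n, D, D), one matrix per symbol
    rounds = 0
    while mats.shape[0] > 1:
        if mats.shape[0] % 2 == 1:
            # Pad with an identity matrix so every matrix has a partner;
            # appending the identity on the right leaves the product unchanged.
            mats = np.concatenate([mats, np.eye(bond_dim)[None]], axis=0)
        # Multiply neighbours (0,1), (2,3), ... in order, preserving the sequence order.
        mats = np.matmul(mats[0::2], mats[1::2])
        rounds += 1
    return float(alpha @ mats[0] @ omega), rounds


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    core = rng.normal(size=(4, 3, 3)) / np.sqrt(3)
    alpha, omega = rng.normal(size=3), rng.normal(size=3)
    seq = [0, 3, 1, 2, 2, 0, 1]
    print(sequential_amplitude(core, alpha, omega, seq))
    print(tree_amplitude(core, alpha, omega, seq))
```

Because matrix multiplication is associative, both functions compute the same product; the tree version simply regroups it so that the chain of dependent steps has logarithmic rather than linear length.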

Multiple Kernel Learning-Based Transfer Regression for Electric Load Forecasting

Di Wu

Boyu Wang

Doina Precup

Benoit Boulet

Electric load forecasting, especially short-term load forecasting (STLF), is becoming more and more important for power system operation. We propose to use multiple kernel learning (MKL) for residential electric load forecasting which provides more flexibility than traditional kernel methods. Computation time is an important issue for short-term forecasting, especially for energy scheduling. However, conventional MKL methods usually lead to complicated optimization problems. Another practical issue for this application is that there may be a very limited amount of data available to train a reliable forecasting model for a new house, while at the same time we may have historical data collected from other houses which can be leveraged to improve the prediction performance for the new house. In this paper, we propose a boosting-based framework for MKL regression to deal with the aforementioned issues for STLF. In particular, we first adopt boosting to learn an ensemble of multiple kernel regressors and then extend this framework to the context of transfer learning. Furthermore, we consider two different settings: homogeneous transfer learning and heterogeneous transfer learning. Experimental results on residential data sets demonstrate that forecasting error can be reduced by a large margin with the knowledge learned from other houses.

2020-03-01

IEEE Transactions on Smart Grid (published)

Seven pillars of precision digital health and medicine

Arash Shaban-Nejad

Martin Michalowski

Niels Peek

John S. Brownstein

David Buckeridge

2020-03-01

Artificial Intelligence in Medicine (published)

On the Morality of Artificial Intelligence

Alexandra Luccioni

Yoshua Bengio

Examines ethical principles and guidelines that surround machine learning and artificial intelligence.

2020-03-01

IEEE Technology and Society Magazine (published)

On Catastrophic Interference in Atari 2600 Games

William Fedus

Dibya Ghosh

John D. Martin

Marc Gendron-Bellemare

Yoshua Bengio

Hugo Larochelle

Model-free deep reinforcement learning is sample inefficient. One hypothesis -- speculated, but not confirmed -- is that catastrophic interference within an environment inhibits learning. We test this hypothesis through a large-scale empirical study in the Arcade Learning Environment (ALE) and, indeed, find supporting evidence. We show that interference causes performance to plateau; the network cannot train on segments beyond the plateau without degrading the policy used to reach there. By synthetically controlling for interference, we demonstrate performance boosts across architectures, learning algorithms and environments. A more refined analysis shows that learning one segment of a game often increases prediction errors elsewhere. Our study provides a clear empirical link between catastrophic interference and sample efficiency in reinforcement learning.

2020-02-28

ArXiv (preprint)

Machine learning analysis of exome trios to contrast the genomic architecture of autism and schizophrenia

Sameer Sardaar

Bill Qi

Alexandre Dionne-Laporte

Guy A. Rouleau

Reihaneh Rabbany

Yannis Trakadis

2020-02-28

BMC Psychiatry (published)

Policy Evaluation Networks

Jean Harb

Tom Schaul

Doina Precup

Pierre-Luc Bacon

2020-02-26

ArXiv (preprint)