Publications

Online continual learning with no task boundaries

Continual learning is the ability of an agent to learn online with a non-stationary and never-ending stream of data. A key component for such a never-ending learning process is to overcome the catastrophic forgetting of previously seen data, a problem that neural networks are well known to suffer from. The solutions developed so far often relax the problem of continual learning to the easier task-incremental setting, where the stream of data is divided into tasks with clear boundaries. In this paper, we break the limits and move to the more challenging online setting where we assume no information about tasks in the data stream. We start from the idea that each learning step should not increase the losses of the previously learned examples through constraining the optimization process. This means that the number of constraints grows linearly with the number of examples, which is a serious limitation. We develop a solution to select a fixed number of constraints that we use to approximate the feasible region defined by the original constraints. We compare our approach against the methods that rely on task boundaries to select a fixed set of examples, and show comparable or even better results, especially when the boundaries are blurry or when the data distributions are imbalanced.
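
To illustrate the constraint idea in the abstract above: to first order, a gradient step increases the loss on a stored example when the inner product between the new gradient and that example's gradient is negative. The sketch below checks this for a single reference gradient and projects the step when the constraint is violated. It is a simplified single-constraint projection in the style of A-GEM, not the constraint-selection method of the paper; `project_gradient` and the toy vectors are hypothetical.

```python
import numpy as np

def project_gradient(g, g_ref):
    """Return g unchanged if it does not conflict with g_ref,
    otherwise remove the component of g that opposes g_ref."""
    dot = float(g @ g_ref)
    if dot >= 0.0:
        # Constraint satisfied: to first order the step does not
        # increase the loss on the stored example.
        return g
    # Constraint violated: project g onto the half-space g . g_ref >= 0.
    return g - (dot / float(g_ref @ g_ref)) * g_ref

# Toy gradients (hypothetical values, for illustration only).
g_new = np.array([1.0, -1.0])   # gradient on the incoming example
g_old = np.array([0.0, 1.0])    # gradient on a stored example
g_safe = project_gradient(g_new, g_old)
print(g_safe, float(g_safe @ g_old))
```

With one constraint the projection has this closed form; with many stored examples the same requirement becomes a set of inequalities, which is why the number of constraints matters.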

2019-03-19

arXiv.org (preprint)

dblp.uni-trier.de

Counterpoint by Convolution

Cheng-Zhi Anna Huang

Tim Cooijmans

Adam Roberts

Aaron Courville

Douglas Eck

Machine learning models of music typically break down the task of composition into a chronological process, composing a piece of music in a single pass from beginning to end. On the contrary, human composers write music in a nonlinear fashion, scribbling motifs here and there, often revisiting choices previously made. We explore the use of blocked Gibbs sampling as an analogue to the human approach, and introduce Coconet, a convolutional neural network in the NADE family of generative models. Despite ostensibly sampling from the same distribution as the NADE ancestral sampling procedure, we find that a blocked Gibbs approach significantly improves sample quality. We provide evidence that this is due to some conditional distributions being poorly modeled. Moreover, we show that even the cheap approximate blocked Gibbs procedure from Yao et al. (2014) yields better samples than ancestral sampling. We demonstrate the versatility of our method on unconditioned polyphonic music generation.

2019-03-17

ArXiv (preprint)

openreview.net

BigBrain 3D atlas of cortical layers: Cortical and laminar thickness gradients diverge in sensory and motor cortices

Konrad Wagstyl

Stéphanie Larocque

Guillem Cucurull

Claude Lepage

Joseph Paul Cohen

Sebastian Bludau

Nicola Palomero-Gallagher

Lindsay B. Lewis

Thomas Funck

Hannah Spitzer

Timo Dickscheid

Paul C. Fletcher

Adriana Romero

Karl Zilles

Katrin Amunts

Yoshua Bengio

Alan C. Evans

Histological atlases of the cerebral cortex, such as those made famous by Brodmann and von Economo, are invaluable for understanding human brain microstructure and its relationship with functional organization in the brain. However, these existing atlases are limited to small numbers of manually annotated samples from a single cerebral hemisphere, measured from 2D histological sections. We present the first whole-brain quantitative 3D laminar atlas of the human cerebral cortex. This atlas was derived from a 3D histological model of the human brain at 20 micron isotropic resolution (BigBrain), using a convolutional neural network to segment, automatically, the cortical layers in both hemispheres. Our approach overcomes many of the historical challenges with measurement of histological thickness in 2D and the resultant laminar atlas provides an unprecedented level of precision and detail. We utilized this BigBrain cortical atlas to test whether previously reported thickness gradients, as measured by MRI in sensory and motor processing cortices, were present in a histological atlas of cortical thickness, and which cortical layers were contributing to these gradients. Cortical thickness increased across sensory processing hierarchies, primarily driven by layers III, V and VI. In contrast, fronto-motor cortices showed the opposite pattern, with decreases in total and pyramidal layer thickness. These findings illustrate how this laminar atlas will provide a link between single-neuron morphology, mesoscale cortical layering, macroscopic cortical thickness and, ultimately, functional neuroanatomy.

2019-03-16

bioRxiv (preprint)

Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future

Nan Rosemary Ke

Amanpreet Singh

Ahmed Touati

Anirudh Goyal

Yoshua Bengio

Devi Parikh

Dhruv Batra

In model-based reinforcement learning, the agent interleaves between model learning and planning. These two components are inextricably intertwined. If the model is not able to provide sensible long-term prediction, the executed planner would exploit model flaws, which can yield catastrophic failures. This paper focuses on building a model that reasons about the long-term future and demonstrates how to use this for efficient planning and exploration. To this end, we build a latent-variable autoregressive model by leveraging recent ideas in variational inference. We argue that forcing latent variables to carry future information through an auxiliary task substantially improves long-term predictions. Moreover, by planning in the latent space, the planner's solution is ensured to be within regions where the model is valid. An exploration strategy can be devised by searching for unlikely trajectories under the model. Our method achieves higher reward faster compared to baselines on a variety of tasks and environments in both the imitation learning and model-based reinforcement learning settings.

2019-03-04

ArXiv (preprint)

Learning Modular Safe Policies in the Bandit Setting with Application to Adaptive Clinical Trials

Hossein Aboutalebi

Tibor Schuster

The stochastic multi-armed bandit problem is a well-known model for studying the exploration-exploitation trade-off. It has significant possible applications in adaptive clinical trials, which allow for dynamic changes in the treatment allocation probabilities of patients. However, most bandit learning algorithms are designed with the goal of minimizing the expected regret. While this approach is useful in many areas, in clinical trials, it can be sensitive to outlier data, especially when the sample size is small. In this paper, we define and study a new robustness criterion for bandit problems. Specifically, we consider optimizing a function of the distribution of returns as a regret measure. This provides practitioners more flexibility to define an appropriate regret measure. The learning algorithm we propose to solve this type of problem is a modification of the BESA algorithm [Baransi et al., 2014], which considers a more general version of regret. We present a regret bound for our approach and evaluate it empirically both on synthetic problems as well as on a dataset from the clinical trial literature. Our approach compares favorably to a suite of standard bandit algorithms.

2019-03-03

ArXiv (preprint)

Deep Learning for Automated Segmentation of Liver Lesions at CT in Patients with Colorectal Cancer Liver Metastases.

Eugene Vorontsov

Milena Cerny

Philippe Régnier

Lisa Di Jorio

Christopher Pal

Réal Lapointe

Franck Vandenbroucke-Menu

Simon Turcotte

Samuel Kadoury

An Tang

Purpose To evaluate the performance, agreement, and efficiency of a fully convolutional network (FCN) for liver lesion detection and segmentation at CT examinations in patients with colorectal liver metastases (CLMs). Materials and Methods This retrospective study evaluated an automated method using an FCN that was trained, validated, and tested with 115, 15, and 26 contrast material-enhanced CT examinations containing 261, 22, and 105 lesions, respectively. Manual detection and segmentation by a radiologist was the reference standard. Performance of fully automated and user-corrected segmentations was compared with that of manual segmentations. The interuser agreement and interaction time of manual and user-corrected segmentations were assessed. Analyses included sensitivity and positive predictive value of detection, segmentation accuracy, Cohen κ, Bland-Altman analyses, and analysis of variance. Results In the test cohort, for lesion size smaller than 10 mm (n = 30), 10-20 mm (n = 35), and larger than 20 mm (n = 40), the detection sensitivity of the automated method was 10%, 71%, and 85%; positive predictive value was 25%, 83%, and 94%; Dice similarity coefficient was 0.14, 0.53, and 0.68; maximum symmetric surface distance was 5.2, 6.0, and 10.4 mm; and average symmetric surface distance was 2.7, 1.7, and 2.8 mm, respectively. For manual and user-corrected segmentation, κ values were 0.42 (95% confidence interval: 0.24, 0.63) and 0.52 (95% confidence interval: 0.36, 0.72); normalized interreader agreement for lesion volume was -0.10 ± 0.07 (95% confidence interval) and -0.10 ± 0.08; and mean interaction time was 7.7 minutes ± 2.4 (standard deviation) and 4.8 minutes ± 2.1 (P < .001), respectively.
Conclusion Automated detection and segmentation of CLM by using deep learning with convolutional neural networks, when manually corrected, improved efficiency but did not substantially change agreement on volumetric measurements. © RSNA, 2019. Supplemental material is available for this article.
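
For readers unfamiliar with the Dice similarity coefficient reported above, the following is a minimal sketch of how it is commonly computed for two binary masks, as twice the overlap divided by the sum of the mask sizes. The function name `dice_coefficient`, the empty-mask convention, and the toy masks are assumptions for illustration and are not taken from the study.

```python
import numpy as np

def dice_coefficient(pred, ref):
    """Dice similarity coefficient 2|A and B| / (|A| + |B|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        # Both masks are empty: treated here as perfect agreement
        # (a convention assumed for this sketch).
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / total

# Toy 1D "masks" (hypothetical data, not from the study).
automated = [0, 1, 1, 1, 0, 0]
manual = [0, 0, 1, 1, 1, 0]
print(dice_coefficient(automated, manual))
```

The two toy masks each mark three voxels and share two of them, so the coefficient is 2 * 2 / (3 + 3) = 2/3.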

2019-02-28

Radiology: Artificial Intelligence (published)

Reinforcement Learning in Stationary Mean-field Games

Jayakumar Subramanian

Aditya Mahajan

Multi-agent reinforcement learning has made significant progress in recent years, but it remains a hard problem. Hence, one often resorts to developing learning algorithms for specific classes of multi-agent systems. In this paper we study reinforcement learning in a specific class of multi-agent systems called mean-field games. In particular, we consider learning in stationary mean-field games. We identify two different solution concepts for such games, stationary mean-field equilibrium and stationary mean-field social-welfare optimal policy, based on whether the agents are non-cooperative or cooperative, respectively. We then generalize these solution concepts to their local variants using arguments based on bounded rationality. For these two local solution concepts, we present two reinforcement learning algorithms. We show that the algorithms converge to the right solution under mild technical conditions and demonstrate this using two numerical examples.

2019-02-28

Adaptive Agents and Multi-Agent Systems (published)

Stochastic Bit-Wise Iterative Decoding of Polar Codes

Kaining Han

Junchao Wang

Warren J. Gross

Jianhao Hu

Polar codes have received recent attention due to their potential to be applied in advanced wireless communication protocols such as the fifth generation mobile communication system (5G). Among the existing decoding algorithms, Belief Propagation (BP) exhibits high-throughput, low-latency, and soft output with a high hardware cost. Stochastic computing, as a form of approximate computing, provides a potential low-cost implementation solution for the BP algorithm. However, existing stochastic BP decoders suffer from a relatively long decoding latency resulting in low hardware efficiency. In this paper, a novel bit-wise iterative stochastic decoding architecture for the BP algorithm is proposed to improve the throughput and hardware efficiency. By utilizing the frozen bits of polar codes and stochastic computing, multiple novel optimization methods are presented to further speed up convergence and increase the hardware efficiency.

2019-02-28

IEEE Transactions on Signal Processing (published)

Prediction of Progression in Multiple Sclerosis Patients

Adrian Tousignant

Paul Lemaitre

Douglas Arnold

Tal Arbel

We present the first automatic end-to-end deep learning framework for the prediction of future patient disability progression (one year from baseline) based on multi-modal brain Magnetic Resonance Images (MRI) of patients with Multiple Sclerosis (MS). The model uses parallel convolutional pathways, an idea introduced by the popular Inception net, and is trained and tested on two large proprietary, multi-scanner, multi-center, clinical trial datasets of patients with Relapsing-Remitting Multiple Sclerosis (RRMS). Experiments on 465 patients on the placebo arms of the trials indicate that the model can accurately predict future disease progression, measured by a sustained increase in the extended disability status scale (EDSS) score over time. Using only the multi-modal MRI provided at baseline, the model achieves an AUC of 0.66 ± 0.055. However, when supplemental lesion label masks are provided as inputs as well, the AUC increases to 0.701 ± 0.027. Furthermore, we demonstrate that uncertainty estimates based on Monte Carlo dropout sample variance correlate with errors made by the model. Clinicians provided with the predictions computed by the model can therefore use the associated uncertainty estimates to assess which scans require further examination.

2019-02-27

MIDL.io/2019/Conference (poster)

openreview.net

The Termination Critic

Anna Harutyunyan

Will Dabney

Diana Borsa

Nicolas Heess

Remi Munos

In this work, we consider the problem of autonomously discovering behavioral abstractions, or options, for reinforcement learning agents. We propose an algorithm that focuses on the termination function, as opposed to - as is common - the policy. The termination function is usually trained to optimize a control objective: an option ought to terminate if another has better value. We offer a different, information-theoretic perspective, and propose that terminations should focus instead on the compressibility of the option’s encoding - arguably a key reason for using abstractions. To achieve this algorithmically, we leverage the classical options framework, and learn the option transition model as a "critic" for the termination function. Using this model, we derive gradients that optimize the desired criteria. We show that the resulting options are non-trivial, intuitively meaningful, and useful for learning.

2019-02-25

ArXiv (preprint)

proceedings.mlr.press
