A Survey on Practical Applications of Multi-Armed and Contextual Bandits
Djallel Bouneffouf
In recent years, the multi-armed bandit (MAB) framework has attracted a lot of attention in various applications, from recommender systems and information retrieval to healthcare and finance, due to its stellar performance combined with certain attractive properties, such as learning from less feedback. The multi-armed bandit field is currently flourishing, as novel problem settings and algorithms motivated by various practical applications are being introduced, building on top of the classical bandit problem. This article aims to provide a comprehensive review of top recent developments in multiple real-life applications of the multi-armed bandit. Specifically, we introduce a taxonomy of common MAB-based applications and summarize the state of the art for each of those domains. Furthermore, we identify important current trends and provide new perspectives pertaining to the future of this exciting and fast-growing field.
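As context for the classical stochastic bandit problem that the surveyed applications build on, the following is a minimal, illustrative UCB1 sketch; it is textbook material rather than an algorithm taken from the survey.

```python
# Minimal UCB1 for the classical stochastic multi-armed bandit (illustrative only).
import math
import random

def ucb1(pull, n_arms: int, horizon: int):
    counts = [0] * n_arms       # number of times each arm was played
    values = [0.0] * n_arms     # running mean reward of each arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1         # play each arm once to initialize estimates
        else:
            # pick the arm with the largest optimistic (mean + exploration bonus) score
            arm = max(range(n_arms),
                      key=lambda a: values[a] + math.sqrt(2 * math.log(t) / counts[a]))
        reward = pull(arm)
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]   # update running mean
    return values

# Example usage: Bernoulli arms with unknown success probabilities.
probs = [0.2, 0.5, 0.7]
estimates = ucb1(lambda a: 1.0 if random.random() < probs[a] else 0.0,
                 n_arms=3, horizon=2000)
```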
Gated Orthogonal Recurrent Units: On Learning to Forget
Li Jing
Caglar Gulcehre
John Peurifoy
Yichen Shen
Max Tegmark
Marin Soljacic
We present a novel recurrent neural network (RNN)-based model that combines the remembering ability of unitary evolution RNNs with the ability of gated RNNs to effectively forget redundant or irrelevant information in its memory. We achieve this by extending restricted orthogonal evolution RNNs with a gating mechanism similar to gated recurrent unit RNNs with a reset gate and an update gate. Our model is able to outperform long short-term memory, gated recurrent units, and vanilla unitary or orthogonal RNNs on several long-term-dependency benchmark tasks. We empirically show that both orthogonal and unitary RNNs lack the ability to forget, and that this ability plays an important role in RNNs. We provide competitive results along with an analysis of our model on many natural sequential tasks, including question answering, speech spectrum prediction, character-level language modeling, and synthetic tasks that involve long-term dependencies such as algorithmic, denoising, and copying tasks.
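A rough sketch of the gating-plus-orthogonality idea described above, assuming a GRU-style reset/update gate around an orthogonally parametrized recurrent transition; the parametrization, nonlinearity, and parameter names here are illustrative and not the authors' exact model.

```python
# Sketch of a gated cell with an orthogonal recurrent weight (illustrative, not GORU's exact form).
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

class GatedOrthogonalCell(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # Orthogonal recurrent weight preserves the hidden-state norm, the property
        # that gives unitary/orthogonal RNNs their long memory.
        self.W_h = orthogonal(nn.Linear(hidden_size, hidden_size, bias=False))
        self.W_x = nn.Linear(input_size, hidden_size)
        # Reset and update gates, as in a GRU, provide the ability to forget.
        self.gates = nn.Linear(input_size + hidden_size, 2 * hidden_size)

    def forward(self, x, h):
        r, z = torch.sigmoid(self.gates(torch.cat([x, h], dim=-1))).chunk(2, dim=-1)
        h_tilde = torch.relu(self.W_x(x) + r * self.W_h(h))   # candidate state
        return (1 - z) * h + z * h_tilde                       # gated update
```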
Multi-Agent Estimation and Filtering for Minimizing Team Mean-Squared Error
Mohammad Afshari
Motivated by estimation problems arising in autonomous vehicles and decentralized control of unmanned aerial vehicles, we consider multi-agent estimation and filtering problems in which multiple agents generate state estimates based on decentralized information and the objective is to minimize a coupled mean-squared error which we call the team mean-squared error. We call the resulting estimates minimum team mean-squared error (MTMSE) estimates. We show that MTMSE estimates are different from minimum mean-squared error (MMSE) estimates. We derive closed-form expressions for MTMSE estimates, which are linear functions of the observations where the corresponding gain depends on the weight matrix that couples the estimation errors. We then consider a filtering problem where a linear stochastic process is monitored by multiple agents which can share their observations (with delay) over a communication graph. We derive expressions to recursively compute the MTMSE estimates. To illustrate the effectiveness of the proposed scheme, we consider an example of estimating the distances between vehicles in a platoon and show that MTMSE estimates significantly outperform MMSE estimates and consensus Kalman filtering estimates.
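For intuition, one plausible way to write the coupled objective described above is shown below; the symbols (weight matrix S, estimates x̂_i) are our own notation and may not match the paper's.

```latex
% Illustrative team mean-squared error objective (our notation):
% agents' errors are coupled through an off-diagonal weight matrix S,
% unlike the per-agent (MMSE) objective.
\[
  J(\hat{x}_1,\dots,\hat{x}_n)
    = \mathbb{E}\!\left[\sum_{i=1}^{n}\sum_{j=1}^{n}
        (\hat{x}_i - x)^{\top} S_{ij}\,(\hat{x}_j - x)\right]
\]
% When S is block diagonal, the problem decouples and the MTMSE
% estimates reduce to the usual per-agent MMSE estimates.
```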
Towards Standardization of Data Licenses: The Montreal Data License
Misha Benjamin
P. Gagnon
Alex Shee
This paper provides a taxonomy for the licensing of data in the fields of artificial intelligence and machine learning. The paper's goal is to build towards a common framework for data licensing akin to the licensing of open source software. Increased transparency and resolving conceptual ambiguities in existing licensing language are two noted benefits of the approach proposed in the paper. In parallel, such benefits may help foster fairer and more efficient markets for data by bringing about clearer tools and concepts that better define how data can be used in the fields of AI and ML. The paper's approach is summarized in a new family of data license language: the Montreal Data License (MDL). Alongside this new license, the authors and their collaborators have developed a web-based tool to generate license language espousing the taxonomies articulated in this paper.
Online continual learning with no task boundaries
Rahaf Aljundi
Min Lin
Baptiste Goujaud
Continual learning is the ability of an agent to learn online with a non-stationary and never-ending stream of data. A key component of such a never-ending learning process is to overcome the catastrophic forgetting of previously seen data, a problem that neural networks are well known to suffer from. The solutions developed so far often relax the problem of continual learning to the easier task-incremental setting, where the stream of data is divided into tasks with clear boundaries. In this paper, we break these limits and move to the more challenging online setting where we assume no information about tasks in the data stream. We start from the idea that each learning step should not increase the losses of the previously learned examples, which we enforce by constraining the optimization process. This means that the number of constraints grows linearly with the number of examples, which is a serious limitation. We develop a solution to select a fixed number of constraints that we use to approximate the feasible region defined by the original constraints. We compare our approach against methods that rely on task boundaries to select a fixed set of examples, and show comparable or even better results, especially when the boundaries are blurry or when the data distributions are imbalanced.
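To make the constraint-based idea concrete, here is a minimal sketch in which each stored example contributes a non-negativity constraint on the inner product between its gradient and the current gradient, and a violated constraint is repaired by projection (A-GEM-style, against an averaged memory gradient). The paper's contribution is how to select a fixed set of such constraints, which this sketch does not implement.

```python
# Sketch of the gradient-constraint repair: keep <g, g_mem> >= 0 so a parameter
# update does not increase the loss on memory examples (to first order).
import numpy as np

def project_gradient(g: np.ndarray, g_mem: np.ndarray) -> np.ndarray:
    """Project the current gradient g onto the half-space defined by g_mem."""
    dot = g @ g_mem
    if dot >= 0:                      # constraint already satisfied, no change
        return g
    # Remove the component of g that conflicts with the memory gradient.
    return g - (dot / (g_mem @ g_mem)) * g_mem
```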
Automated segmentation of cortical layers in BigBrain reveals divergent cortical and laminar thickness gradients in sensory and motor cortices.
Konrad Wagstyl
Stéphanie Larocque
Guillem Cucurull
Claude Lepage
Joseph Paul Cohen
Sebastian Bludau
Nicola Palomero-Gallagher
L. Lewis
Thomas Funck
Hannah Spitzer
Timo Dicksheid
Paul C Fletcher
Karl Zilles
Katrin Amunts
Alan C. Evans
Large-scale in vivo neuroimaging datasets offer new possibilities for reliable, well-powered measures of interregional structural differences and biomarkers of pathological changes in a wide variety of neurological and psychiatric diseases. However, studies so far have been structurally and functionally imprecise, being unable to relate pathological changes to specific cortical layers or neurobiological processes. We developed artificial neural networks to segment cortical and laminar surfaces in the BigBrain, a 3D histological model of the human brain. We sought to test whether previously reported thickness gradients, as measured by MRI, in sensory and motor processing cortices were present in a histological atlas of cortical thickness, and which cortical layers were contributing to these gradients. Identifying common gradients of cortical organisation enables us to meaningfully relate microstructural, macrostructural and functional cortical parameters. Analysis of thickness gradients across sensory cortices, using our fully segmented six-layered model, was consistent with MRI findings, showing increasing thickness moving up the processing hierarchy. In contrast, fronto-motor cortices showed the opposite pattern, with changes in thickness of layers III, V and VI being the primary drivers of these gradients. As well as identifying key differences between sensory and motor gradients, our findings show how the use of this laminar atlas offers insights that will be key to linking single-neuron morphological changes, mesoscale cortical layers and macroscale cortical thickness.
BigBrain 3D atlas of cortical layers: Cortical and laminar thickness gradients diverge in sensory and motor cortices
Konrad Wagstyl
Stéphanie Larocque
Guillem Cucurull
Claude Lepage
Joseph Paul Cohen
Sebastian Bludau
Nicola Palomero-Gallagher
L. Lewis
Thomas Funck
Hannah Spitzer
Timo Dicksheid
Paul C Fletcher
Karl Zilles
Katrin Amunts
Alan C. Evans
Histological atlases of the cerebral cortex, such as those made famous by Brodmann and von Economo, are invaluable for understanding human brain microstructure and its relationship with functional organization in the brain. However, these existing atlases are limited to small numbers of manually annotated samples from a single cerebral hemisphere, measured from 2D histological sections. We present the first whole-brain quantitative 3D laminar atlas of the human cerebral cortex. This atlas was derived from a 3D histological model of the human brain at 20 micron isotropic resolution (BigBrain), using a convolutional neural network to automatically segment the cortical layers in both hemispheres. Our approach overcomes many of the historical challenges of measuring histological thickness in 2D, and the resulting laminar atlas provides an unprecedented level of precision and detail. We utilized this BigBrain cortical atlas to test whether previously reported thickness gradients, as measured by MRI in sensory and motor processing cortices, were present in a histological atlas of cortical thickness, and which cortical layers were contributing to these gradients. Cortical thickness increased across sensory processing hierarchies, primarily driven by layers III, V and VI. In contrast, fronto-motor cortices showed the opposite pattern, with decreases in total and pyramidal layer thickness. These findings illustrate how this laminar atlas will provide a link between single-neuron morphology, mesoscale cortical layering, macroscopic cortical thickness and, ultimately, functional neuroanatomy.
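As a purely illustrative note on what a laminar thickness measurement involves, the snippet below computes per-layer and total thickness from hypothetical layer-boundary depths along cortical columns; the array names and values are ours and do not reflect the BigBrain pipeline's code or data format.

```python
# Laminar thickness from successive boundary depths (hypothetical data).
import numpy as np

# boundaries[k, v]: depth in mm of the k-th laminar boundary at vertex v (pial surface = 0).
boundaries = np.array([[0.0, 0.0],
                       [0.3, 0.25],
                       [0.9, 0.80],
                       [1.6, 1.40],
                       [2.1, 1.95],
                       [2.6, 2.50],
                       [3.1, 3.05]])  # 7 boundaries -> 6 layers, 2 example vertices

laminar_thickness = np.diff(boundaries, axis=0)   # thickness of each layer per vertex
total_thickness = laminar_thickness.sum(axis=0)   # per-vertex cortical thickness
```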
Interpolation Consistency Training for Semi-Supervised Learning
Vikas Verma
Alex Lamb
Juho Kannala
David Lopez-Paz
LF-PPL: A Low-Level First Order Probabilistic Programming Language for Non-Differentiable Models
Yuanshuo Zhou
Bradley Gram-Hansen
Tobias Kohn
Tom Rainforth
Hongseok Yang
F. Wood
We develop a new Low-level, First-order Probabilistic Programming Language (LF-PPL) suited for models containing a mix of continuous, discrete, and/or piecewise-continuous variables. The key success of this language and its compilation scheme is its ability to automatically distinguish the parameters with respect to which the density function is discontinuous, while further providing runtime checks for boundary crossings. This enables the introduction of new inference engines that are able to exploit gradient information while remaining efficient for models which are not everywhere differentiable. We demonstrate this ability by incorporating a discontinuous Hamiltonian Monte Carlo (DHMC) inference engine that is able to deliver automated and efficient inference for non-differentiable models. Our system is backed by a mathematical formalism that ensures that any model expressed in this language has a density with measure-zero discontinuities, which maintains the validity of the inference engine.
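A toy example of the kind of piecewise-continuous target LF-PPL is designed for, together with the sort of boundary-crossing check the abstract refers to; the model, constants, and function names are ours, not taken from the paper.

```python
# Toy piecewise-continuous log-density with a single discontinuity at x = 0.
import math

def log_density(x: float) -> float:
    # Gaussian prior plus a branch-dependent (piecewise-constant) likelihood term.
    log_prior = -0.5 * x * x - 0.5 * math.log(2 * math.pi)
    log_lik = -1.0 if x > 0 else -3.0   # discontinuous contribution at x = 0
    return log_prior + log_lik

def crosses_boundary(x_old: float, x_new: float) -> bool:
    # Runtime check of the kind a compiler could insert: did a gradient-based
    # (e.g., leapfrog) step move the parameter across the discontinuity?
    return (x_old > 0) != (x_new > 0)
```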
Learning Dynamics Model in Reinforcement Learning by Incorporating the Long Term Future
Nan Rosemary Ke
Amanpreet Singh
Ahmed Touati
Anirudh Goyal
Devi Parikh
Dhruv Batra
In model-based reinforcement learning, the agent interleaves between model learning and planning. These two components are inextricably intertwined. If the model is not able to provide sensible long-term predictions, the planner will exploit model flaws, which can yield catastrophic failures. This paper focuses on building a model that reasons about the long-term future and demonstrates how to use this for efficient planning and exploration. To this end, we build a latent-variable autoregressive model by leveraging recent ideas in variational inference. We argue that forcing latent variables to carry future information through an auxiliary task substantially improves long-term predictions. Moreover, by planning in the latent space, the planner's solution is ensured to be within regions where the model is valid. An exploration strategy can be devised by searching for unlikely trajectories under the model. Our method achieves higher reward faster than baselines on a variety of tasks and environments in both the imitation learning and model-based reinforcement learning settings.
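As a hedged illustration of the exploration idea mentioned above (searching for trajectories the model finds unlikely), the sketch below scores candidate action sequences by their log-likelihood under the learned model and explores the least likely one; `model.rollout` and `model.log_prob` are hypothetical interfaces, not the paper's API.

```python
# Pick the candidate plan whose imagined trajectory the model finds least likely.
import numpy as np

def pick_exploratory_plan(model, state, candidate_plans):
    scores = []
    for actions in candidate_plans:
        traj = model.rollout(state, actions)          # imagined trajectory in latent space
        scores.append(model.log_prob(traj, actions))  # likelihood under the learned model
    return candidate_plans[int(np.argmin(scores))]    # least likely -> most novel
```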
Deep Learning for Automated Segmentation of Liver Lesions at CT in Patients with Colorectal Cancer Liver Metastases.
Eugene Vorontsov
Milena Cerny
Philippe Régnier
Lisa Di Jorio
Réal Lapointe
Franck Vandenbroucke-Menu
Simon Turcotte
Samuel Kadoury
An Tang
Purpose To evaluate the performance, agreement, and efficiency of a fully convolutional network (FCN) for liver lesion detection and segmentation at CT examinations in patients with colorectal liver metastases (CLMs). Materials and Methods This retrospective study evaluated an automated method using an FCN that was trained, validated, and tested with 115, 15, and 26 contrast material-enhanced CT examinations containing 261, 22, and 105 lesions, respectively. Manual detection and segmentation by a radiologist was the reference standard. Performance of fully automated and user-corrected segmentations was compared with that of manual segmentations. The interuser agreement and interaction time of manual and user-corrected segmentations were assessed. Analyses included sensitivity and positive predictive value of detection, segmentation accuracy, Cohen κ, Bland-Altman analyses, and analysis of variance. Results In the test cohort, for lesion sizes smaller than 10 mm (n = 30), 10-20 mm (n = 35), and larger than 20 mm (n = 40), the detection sensitivity of the automated method was 10%, 71%, and 85%; positive predictive value was 25%, 83%, and 94%; Dice similarity coefficient was 0.14, 0.53, and 0.68; maximum symmetric surface distance was 5.2, 6.0, and 10.4 mm; and average symmetric surface distance was 2.7, 1.7, and 2.8 mm, respectively. For manual and user-corrected segmentation, κ values were 0.42 (95% confidence interval: 0.24, 0.63) and 0.52 (95% confidence interval: 0.36, 0.72); normalized interreader agreement for lesion volume was -0.10 ± 0.07 (95% confidence interval) and -0.10 ± 0.08; and mean interaction time was 7.7 minutes ± 2.4 (standard deviation) and 4.8 minutes ± 2.1 (P < .001), respectively. Conclusion Automated detection and segmentation of CLM by using deep learning with convolutional neural networks, when manually corrected, improved efficiency but did not substantially change agreement on volumetric measurements. © RSNA, 2019. Supplemental material is available for this article.
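For reference, the Dice similarity coefficient reported in the results can be computed on binary masks as follows; this is the standard definition, shown independently of the paper's evaluation code.

```python
# Dice similarity coefficient between a predicted and a reference binary mask.
import numpy as np

def dice_coefficient(pred: np.ndarray, ref: np.ndarray, eps: float = 1e-8) -> float:
    pred, ref = pred.astype(bool), ref.astype(bool)
    intersection = np.logical_and(pred, ref).sum()
    # 2 * |A ∩ B| / (|A| + |B|); eps guards against two empty masks.
    return 2.0 * intersection / (pred.sum() + ref.sum() + eps)
```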