Publications

Online variance-reducing optimization

Nicolas Roux

Reza Babanezhad Harikandeh

Reza Babanezhad

Pierre-Antoine Manzagol

2018-02-11

International Conference on Learning Representations (published)

openreview.net

SGD Smooths the Sharpest Directions

Stanisław Jastrzębski

Amos Storkey

Stochastic gradient descent (SGD) is able to find regions that generalize well, even in drastically over-parametrized models such as deep neural networks. We observe that noise in SGD controls the spectral norm and conditioning of the Hessian throughout the training. We hypothesize that this phenomenon is caused by the dynamics of neurons saturating their non-linearity along the largest curvature directions, thus leading to improved conditioning.

2018-02-11

(published)

openreview.net

Extending the Framework of Equilibrium Propagation to General Dynamics

2018-02-10

International Conference on Learning Representations (published)

openreview.net

Learning Robust Options

Daniel J. Mankowitz

Timothy A. Mann

Pierre-Luc Bacon

Doina Precup

Shie Mannor

Robust reinforcement learning aims to produce policies that have strong guarantees even in the face of environments/transition models whose parameters have strong uncertainty. Existing work uses value-based methods and the usual primitive action setting. In this paper, we propose robust methods for learning temporally abstract actions, in the framework of options. We present a Robust Options Policy Iteration (ROPI) algorithm with convergence guarantees, which learns options that are robust to model uncertainty. We utilize ROPI to learn robust options with the Robust Options Deep Q Network (RO-DQN) that solves multiple tasks and mitigates model misspecification due to model uncertainty. We present experimental results which suggest that policy iteration with linear features may have an inherent form of robustness when using coarse feature representations. In addition, we present experimental results which demonstrate that robustness helps policy iteration implemented on top of deep neural networks to generalize over a much broader range of dynamics than non-robust policy iteration.

2018-02-08

ArXiv (preprint)

doi.org

arxiv.org

Hierarchical Adversarially Learned Inference

Ishmael Belghazi

Sai Rajeswar

We propose a novel hierarchical generative model with a simple Markovian structure and a corresponding inference model. Both the generative and inference models are trained using the adversarial learning paradigm. We demonstrate that the hierarchical structure supports the learning of progressively more abstract representations as well as providing semantically meaningful reconstructions with different levels of fidelity. Furthermore, we show that minimizing the Jensen-Shannon divergence between the generative and inference networks is enough to minimize the reconstruction error. The resulting semantically meaningful hierarchical latent structure discovery is exemplified on the CelebA dataset. There, we show that the features learned by our model in an unsupervised way outperform the best handcrafted features. Furthermore, the extracted features remain competitive when compared to several recent deep supervised approaches on an attribute prediction task on CelebA. Finally, we leverage the model's inference network to achieve state-of-the-art performance on a semi-supervised variant of the MNIST digit classification task.

2018-02-03

ArXiv (preprint)

openreview.net

Patterns of reintubation in extremely preterm infants: a longitudinal cohort study

Wissam Shalish

Lara Kanbar

Martin Keszler

Sanjay Chawla

Lajos Kovacs

Smita Rao

Bogdan A Panaitescu

Alyse Laliberte

Doina Precup

Karen Brown

Robert E Kearney

Guilherme M Sant'Anna

2018-01-30

Pediatric Research (published)

doi.org

Combining intraoperative ultrasound brain shift correction and augmented reality visualizations: a pilot study of eight cases

Ian J. Gerard

Marta Kersten-Oertel

Simon Drouin

Jeffery A. Hall

Kevin Petrecca

Dante De Nigris

Daniel A. Di Giovanni

Tal Arbel

D. Louis Collins

2018-01-25

Journal of Medical Imaging (published)

doi.org

A3T: Adversarially Augmented Adversarial Training

Recent research showed that deep neural networks are highly sensitive to so-called adversarial perturbations, which are tiny perturbations of the input data purposely designed to fool a machine learning classifier. Most classification models, including deep learning models, are highly vulnerable to adversarial attacks. In this work, we investigate a procedure to improve adversarial robustness of deep neural networks through enforcing representation invariance. The idea is to train the classifier jointly with a discriminator attached to one of its hidden layers and trained to filter the adversarial noise. We perform preliminary experiments to test the viability of the approach and to compare it to other standard adversarial training methods.

2018-01-11

ArXiv (preprint)

arxiv.org

Modular Networks for Validating Community Detection Algorithms

Justin J Fagnan

Afra Abnar

Reihaneh Rabbany

Osmar R Zaiane

How can we accurately compare different community detection algorithms? These algorithms cluster nodes in a given network, and their performance is often validated on benchmark networks with explicit ground-truth communities. Given the lack of cluster labels in real-world networks, a model that generates realistic networks is required for accurate evaluation of these algorithms. In this paper, we present a simple, intuitive, and flexible benchmark generator to generate intrinsically modular networks for community validation. We show how the generated networks closely comply with the characteristics observed for real networks, while their characteristics can be directly controlled to match a wide range of real-world networks. We further show how common community detection algorithms rank differently when evaluated on these benchmarks compared to currently available alternatives.

2018-01-03

ArXiv (preprint)

arxiv.org

Accelerated Stochastic Power Iteration

Peng Xu

Bryan Dawei He

Christopher De Sa

Ioannis Mitliagkas

Christopher Re

Principal component analysis (PCA) is one of the most powerful tools in machine learning. The simplest method for PCA, the power iteration, requires O(1/Δ) full-data passes to recover the principal component of a matrix with eigen-gap Δ. Lanczos, a significantly more complex method, achieves an accelerated rate of O(1/√Δ) passes. Modern applications, however, motivate methods that only ingest a subset of available data, known as the stochastic setting. In the online stochastic setting, simple algorithms like Oja's iteration achieve the optimal sample complexity O(σ²/Δ²). Unfortunately, they are fully sequential, and also require O(σ²/Δ²) iterations, far from the O(1/√Δ) rate of Lanczos. We propose a simple variant of the power iteration with an added momentum term that achieves both the optimal sample and iteration complexity. In the full-pass setting, standard analysis shows that momentum achieves the accelerated rate, O(1/√Δ). We demonstrate empirically that naively applying momentum to a stochastic method does not result in acceleration. We perform a novel, tight variance analysis that reveals the "breaking-point variance" beyond which this acceleration does not occur. By combining this insight with modern variance reduction techniques, we construct stochastic PCA algorithms, for the online and offline setting, that achieve an accelerated iteration complexity O(1/√Δ). Due to the embarrassingly parallel nature of our methods, this acceleration translates directly to wall-clock time if deployed in a parallel environment. Our approach is very general, and applies to many non-convex optimization problems that can now be accelerated using the same technique.

2017-12-31

AISTATS (published)

proceedings.mlr.press

Advances in Artificial Intelligence

Ebrahim Bagheri

Jackie CK Cheung

2017-12-31

Lecture Notes in Computer Science (published)

doi.org

Analyzing Alzheimer’s Disease Progression from Sequential Magnetic Resonance Imaging Scans Using Deep 3D Convolutional Neural Networks

Sumana Basu

Konrad Wagstyl

Azar Zandifar

Louis Collins

Adriana Romero

Doina Precup

Alzheimer’s is a progressive, neurodegenerative disease that causes irreversible damage to the brain tissue. It impairs the ability to form and retrieve memory, and eventually disrupts the natural flow of life by affecting the ability to carry out even day-to-day activities. The disease is typically diagnosed from the symptoms (Mini Mental State Examination, [8]), such as decline in cognitive abilities, visual and/or speech impairment, and loss of memory, rather than from the structural changes in the brain (biomarkers) that cause it. But the pathological changes in the brain start decades before the manifestation of the symptoms [7]. Magnetic Resonance Imaging (MRI) is capable of capturing the complex changes in the brain, even if it is difficult for humans to extract those features from the low-contrast, multi-dimensional MRIs [1]. There is a considerable amount of work on analyzing Alzheimer’s disease. However, the vast majority of it aims to predict the state of the disease at the current time step.

2017-12-31

(published)

www.semanticscholar.org
