Doina Precup

guangyuan.wang@mila.quebec

Guangyuan Wang

Research Intern - McGill University

Haque Ishfaq

PhD - McGill University

PhD - McGill University

huanghow@mila.quebec

Janarthanan Rajendran

Postdoctorate - Université de Montréal

Principal supervisor :

Sarath Chandar Anbil Parthipan

janarthanan.rajendran@mila.quebec

jonathan.colaco-carr@mila.quebec

Jaume Minano Masip

PhD - McGill University

masipmij@mila.quebec

Jesse Farebrother

PhD - McGill University

Principal supervisor :

Marc Gendron-Bellemare

Master's Research - McGill University

Principal supervisor :

Prakash Panangaden

Jonathan Lebensold

PhD - McGill University

Research Intern - McGill University

keyu.wang@mila.quebec

Kushal Arora

PhD - McGill University

Principal supervisor :

Lynn Cherif

Master's Research - McGill University

Co-supervisor :

lynn.cherif@mila.quebec

Mohammad Sami Nur Islam

Mandana Samiei

PhD - McGill University

Co-supervisor :

PhD - McGill University

delvermm@mila.quebec

Martin Klissarov

PhD - McGill University

Harry Zhao

PhD - McGill University

Co-supervisor :

Research Intern - McGill University

mohammad-sami-nur.islam@mila.quebec

nathan.de-lara@mila.quebec

Nathan de Lara

Research Intern - McGill University

Nate Rahn

PhD - McGill University

Principal supervisor :

Marc Gendron-Bellemare

nathan.rahn@mila.quebec

Neil Girdhar

Collaborating researcher - McGill University

neil.girdhar@mila.quebec

Nikhil Vemgal

Master's Research - McGill University

nikhil-murali.vemgal@mila.quebec

padideh.nouri@mila.quebec

Nishanth Anand Vemgal

PhD - McGill University

Master's Research - Université de Montréal

PhD - McGill University

Ray Chua

PhD - McGill University

Co-supervisor :

Blake Richards

chuaraym@mila.quebec

Riashat Islam

PhD - McGill University

Safa Alver

PhD - McGill University

alversaf@mila.quebec

Sahand Rezaei-Shoshtari

PhD - McGill University

Co-supervisor :

David Meger

sahand.rezaei-shoshtari@mila.quebec

PhD - McGill University

PhD - McGill University

Co-supervisor :

David Meger

fujimots@mila.quebec

Shahrad Mohammadzadeh

Collaborating researcher - McGill University

Principal supervisor :

Reihaneh Rabbany

shahrad.mohammadzadeh@mila.quebec

PhD - McGill University

PhD - McGill University

shuyuan.zhang@mila.quebec

Sitao Luan

PhD - McGill University

Steve Wen

Undergraduate - McGill University

steve.wen@mila.quebec

Sumana Basu

PhD - McGill University

Co-supervisor :

Adriana Romero Soriano

Master's Research - Université de Montréal

Principal supervisor :

Yoshua Bengio

thomas.jiralerspong@mila.quebec

PhD - McGill University

cheluver@mila.quebec

Wesley Chung

PhD - McGill University

Principal supervisor :

David Meger

chungwes@mila.quebec

Ray Luo

PhD - McGill University

Principal supervisor :

Xujie Si

luo.ziyan@mila.quebec

Skipper: Combining Spatial and Temporal Abstraction for Better Generalization

Blog Posts

February 22, 2024

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Publications

Improving Pathological Structure Segmentation via Transfer Learning Across Diseases

Barleen Kaur

Paul Lemaitre

Raghav Mehta

Nazanin Mohammadi Sepahvand

Douglas Arnold

2019-10-13

Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data (published)

Learning Options with Interest Functions

Learning temporal abstractions which are partial solutions to a task and could be reused for solving other tasks is an ingredient that can help agents to plan and learn efficiently. In this work, we tackle this problem in the options framework. We aim to autonomously learn options which are specialized in different state space regions by proposing a notion of interest functions, which generalizes initiation sets from the options framework for function approximation. We build on the option-critic framework to derive policy gradient theorems for interest functions, leading to a new interest-option-critic architecture.

2019-07-17

Proceedings of the AAAI Conference on Artificial Intelligence (published)

Leveraging Observations in Bandits: Between Risks and Benefits

Andrei-stefan Lupu

Audrey Durand

Imitation learning has been widely used to speed up learning in novice agents, by allowing them to leverage existing data from experts. Allowing an agent to be influenced by external observations can benefit the learning process, but it also puts the agent at risk of following sub-optimal behaviours. In this paper, we study this problem in the context of bandits. More specifically, we consider that an agent (learner) is interacting with a bandit-style decision task, but can also observe a target policy interacting with the same environment. The learner observes only the target’s actions, not the rewards obtained. We introduce a new bandit optimism modifier that uses conditional optimism contingent on the actions of the target in order to guide the agent’s exploration. We analyze the effect of this modification on the well-known Upper Confidence Bound algorithm by proving that it preserves a regret upper-bound of order O(ln T), even in the presence of a very poor target, and we derive the dependency of the expected regret on the general target policy. We provide empirical results showing both great benefits as well as certain limitations inherent to observational learning in the multi-armed bandit setting. Experiments are conducted using targets satisfying theoretical assumptions with high probability, thus narrowing the gap between theory and application.

2019-07-17

Proceedings of the AAAI Conference on Artificial Intelligence (published)

Prediction of Disease Progression in Multiple Sclerosis Patients using Deep Learning Analysis of MRI Data

Adrian Tousignant

Paul Lemaitre

Douglas Arnold

We present the first automatic end-to-end deep learning framework for the prediction of future patient disability progression (one year from baseline) based on multi-modal brain Magnetic Resonance Images (MRI) of patients with Multiple Sclerosis (MS). The model uses parallel convolutional pathways, an idea introduced by the popular Inception net (Szegedy et al., 2015) and is trained and tested on two large proprietary, multi-scanner, multi-center, clinical trial datasets of patients with Relapsing-Remitting Multiple Sclerosis (RRMS). Experiments on 465 patients on the placebo arms of the trials indicate that the model can accurately predict future disease progression, measured by a sustained increase in the extended disability status scale (EDSS) score over time. Using only the multi-modal MRI provided at baseline, the model achieves an AUC of 0.66 ± 0.055. However, when supplemental lesion label masks are provided as inputs as well, the AUC increases to 0.701 ± 0.027. Furthermore, we demonstrate that uncertainty estimates based on Monte Carlo dropout sample variance correlate with errors made by the model. Clinicians provided with the predictions computed by the model can therefore use the associated uncertainty estimates to assess which scans require further examination.

2019-05-24

International Conference on Medical Imaging with Deep Learning (published)

proceedings.mlr.press

Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning

Guillaume Rabusseau

Tianyu Li

In this paper, we unravel a fundamental connection between weighted finite automata (WFAs) and second-order recurrent neural networks (2-RNNs): in the case of sequences of discrete symbols, WFAs and 2-RNNs with linear activation functions are expressively equivalent. Motivated by this result, we build upon a recent extension of the spectral learning algorithm to vector-valued WFAs and propose the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors. This algorithm relies on estimating low rank sub-blocks of the so-called Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performance of the proposed method is assessed in a simulation study.

2019-04-11

Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics (published)

proceedings.mlr.press

Prediction of Progression in Multiple Sclerosis Patients

Adrian Tousignant

Paul Lemaitre

Douglas Arnold

We present the first automatic end-to-end deep learning framework for the prediction of future patient disability progression (one year from baseline) based on multi-modal brain Magnetic Resonance Images (MRI) of patients with Multiple Sclerosis (MS). The model uses parallel convolutional pathways, an idea introduced by the popular Inception net and is trained and tested on two large proprietary, multi-scanner, multi-center, clinical trial datasets of patients with Relapsing-Remitting Multiple Sclerosis (RRMS). Experiments on 465 patients on the placebo arms of the trials indicate that the model can accurately predict future disease progression, measured by a sustained increase in the extended disability status scale (EDSS) score over time. Using only the multi-modal MRI provided at baseline, the model achieves an AUC of 0.66 ± 0.055. However, when supplemental lesion label masks are provided as inputs as well, the AUC increases to 0.701 ± 0.027. Furthermore, we demonstrate that uncertainty estimates based on Monte Carlo dropout sample variance correlate with errors made by the model. Clinicians provided with the predictions computed by the model can therefore use the associated uncertainty estimates to assess which scans require further examination.

2019-02-28

MIDL.io/2019/Conference (poster)

openreview.net

The Termination Critic

Anna Harutyunyan

Will Dabney

Diana L. Borsa

Nicolas Heess

Remi Munos

In this work, we consider the problem of autonomously discovering behavioral abstractions, or options, for reinforcement learning agents. We propose an algorithm that focuses on the termination function, as opposed to - as is common - the policy. The termination function is usually trained to optimize a control objective: an option ought to terminate if another has better value. We offer a different, information-theoretic perspective, and propose that terminations should focus instead on the compressibility of the option’s encoding - arguably a key reason for using abstractions. To achieve this algorithmically, we leverage the classical options framework, and learn the option transition model as a “critic” for the termination function. Using this model, we derive gradients that optimize the desired criteria. We show that the resulting options are non-trivial, intuitively meaningful, and useful for learning.

2019-02-26

ArXiv (preprint)

The Termination Critic

Anna Harutyunyan

Will Dabney

Diana Borsa

Nicolas Heess

Remi Munos

2019-01-01

AISTATS (published)

Clustering-Oriented Representation Learning with Attractive-Repulsive Loss

Kian Kenyon-Dean

Andre Cianflone

Lucas Caccia

Guillaume Rabusseau

Jackie Cheung

The standard loss function used to train neural network classifiers, categorical cross-entropy (CCE), seeks to maximize accuracy on the training data; building useful representations is not a necessary byproduct of this objective. In this work, we propose clustering-oriented representation learning (COREL) as an alternative to CCE in the context of a generalized attractive-repulsive loss framework. COREL has the consequence of building latent representations that collectively exhibit the quality of natural clustering within the latent space of the final hidden layer, according to a predefined similarity function. Despite being simple to implement, COREL variants outperform or perform equivalently to CCE in a variety of scenarios, including image and news article classification using both feed-forward and convolutional neural networks. Analysis of the latent spaces created with different similarity functions facilitates insights on the different use cases COREL variants can satisfy, where the Cosine-COREL variant makes a consistently clusterable latent space, while Gaussian-COREL consistently obtains better classification accuracy than CCE.

2018-12-18

ArXiv (preprint)

Environments for Lifelong Reinforcement Learning

Sarath Chandar Anbil Parthipan

Shagun Sodhani

To achieve general artificial intelligence, reinforcement learning (RL) agents should learn not only to optimize returns for one specific task but also to constantly build more complex skills and scaffold their knowledge about the world, without forgetting what has already been learned. In this paper, we discuss the desired characteristics of environments that can support the training and evaluation of lifelong reinforcement learning agents, review existing environments from this perspective, and propose recommendations for devising suitable environments in the future.

2018-11-26

ArXiv (preprint)

Exploring Uncertainty Measures in Deep Networks for Multiple Sclerosis Lesion Detection and Segmentation

Tanya Nair

Douglas Arnold

2018-09-26

Medical Image Computing and Computer Assisted Intervention – MICCAI 2018 (published)

Attend Before you Act: Leveraging human visual attention for continual learning

When humans perform a task, such as playing a game, they selectively pay attention to certain parts of the visual input, gathering relevant information and sequentially combining it to build a representation from the sensory data. In this work, we explore leveraging where humans look in an image as an implicit indication of what is salient for decision making. We build on top of the UNREAL architecture in DeepMind Lab's 3D navigation maze environment. We train the agent both with original images and foveated images, which were generated by overlaying the original images with saliency maps generated using a real-time spectral residual technique. We investigate the effectiveness of this approach in transfer learning by measuring performance in the context of noise in the environment.

2018-07-25

ArXiv (preprint)