Publications

Attend Before you Act: Leveraging human visual attention for continual learning

When humans perform a task, such as playing a game, they selectively pay attention to certain parts of the visual input, gathering relevant information and sequentially combining it to build a representation from the sensory data. In this work, we explore leveraging where humans look in an image as an implicit indication of what is salient for decision making. We build on top of the UNREAL architecture in DeepMind Lab's 3D navigation maze environment. We train the agent both with original images and foveated images, which were generated by overlaying the original images with saliency maps generated using a real-time spectral residual technique. We investigate the effectiveness of this approach in transfer learning by measuring performance in the context of noise in the environment.

2018-07-24

ArXiv (preprint)

arxiv.org

Active Search of Connections for Case Building and Combating Human Trafficking

Reihaneh Rabbany

David Bayani

Artur Dubrawski

How can we help an investigator to efficiently connect the dots and uncover the network of individuals involved in a criminal activity based on the evidence of their connections, such as visiting the same address, or transacting with the same bank account? We formulate this problem as Active Search of Connections, which finds target entities that share evidence of different types with a given lead, where their relevance to the case is queried interactively from the investigator. We present RedThread, an efficient solution for inferring related and relevant nodes while incorporating the user's feedback to guide the inference. Our experiments focus on case building for combating human trafficking, where the investigator follows leads to expose organized activities, i.e. different escort advertisements that are connected and possibly orchestrated. RedThread is a local algorithm and enables online case building when mining millions of ads posted in one of the largest classified advertising websites. The results of RedThread are interpretable, as they explain how the results are connected to the initial lead. We experimentally show that RedThread learns the importance of the different types and different pieces of evidence, while the former could be transferred between cases.

2018-07-18

Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (published)

doi.org

Undersampling and Bagging of Decision Trees in the Analysis of Cardiorespiratory Behavior for the Prediction of Extubation Readiness in Extremely Preterm Infants

Lara Kanbar

Charles Onu

Wissam Shalish

Karen A. Brown

Guilherme M. Sant’Anna

Robert E. Kearney

Doina Precup

Extremely preterm infants often require endotracheal intubation and mechanical ventilation during the first days of life. Due to the detrimental effects of prolonged invasive mechanical ventilation (IMV), clinicians aim to extubate infants as soon as they deem them ready. Unfortunately, existing strategies for prediction of extubation readiness vary across clinicians and institutions, and lead to high reintubation rates. We present an approach using Random Forest classifiers for the analysis of cardiorespiratory variability to predict extubation readiness. We address the issue of data imbalance by employing random undersampling of examples from the majority class before training each Decision Tree in a bag. By incorporating clinical domain knowledge, we further demonstrate that our classifier could have identified 71% of infants who failed extubation, while maintaining a success detection rate of 78%.

2018-07-17

2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (published)

doi.org

arxiv.org
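
The per-tree undersampling described in the extubation-readiness abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `balanced_bags`, the choice to bootstrap the minority class, and the equal class sizes per bag are all assumptions made for the example.

```python
import random


def balanced_bags(labels, n_bags, seed=0):
    """Return one list of training indices per tree in the bag.

    Assumption (not from the abstract): each bag holds a bootstrap sample of
    the minority class plus an equally sized random undersample, drawn
    without replacement, of the majority class.
    """
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    if len(by_class) != 2:
        raise ValueError("expected exactly two classes")

    # The shorter index list is the minority class.
    minority, majority = sorted(by_class.values(), key=len)

    bags = []
    for _ in range(n_bags):
        minority_part = rng.choices(minority, k=len(minority))  # with replacement
        majority_part = rng.sample(majority, len(minority))     # undersample
        bags.append(minority_part + majority_part)
    return bags
```

Each returned index list would then be used to fit one decision tree, so every tree sees a balanced training set while the ensemble as a whole still covers most of the majority class.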

Eligibility Traces for Options

Ayush Jain

Doina Precup

Temporally extended actions not only represent knowledge in the hierarchical setup in reinforcement learning, they also improve exploration while reducing the complexity of choosing actions. The option framework provides a concrete way to implement and reason about temporal abstraction. This work attempts to test the utility of eligibility traces with options and find good ways of doing multi-step intra-option updates. Three algorithms based on off-policy methods (importance sampling, tree-backup and retrace) are proposed for using eligibility traces with options.

2018-07-08

International Joint Conference on Autonomous Agents and Multiagent Systems (published)

doi.org
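
For orientation, the three off-policy corrections named in the eligibility-traces abstract above differ in how they scale the trace at each step. The sketch below shows the standard flat-action coefficients from the off-policy literature; it is an assumption that the option-level algorithms in the paper use analogous forms, and the function name and signature are hypothetical.

```python
def trace_coefficient(method, pi, mu, lam=1.0):
    """Per-step trace-decay coefficient for an off-policy multi-step return.

    pi  : probability of the taken action under the target policy
    mu  : probability of the taken action under the behavior policy
    lam : trace-decay parameter in [0, 1]

    These are the usual flat-action forms, shown only as an illustration.
    """
    if method == "importance_sampling":
        return lam * pi / mu            # per-decision importance ratio
    if method == "tree_backup":
        return lam * pi                 # no behavior probabilities needed
    if method == "retrace":
        return lam * min(1.0, pi / mu)  # truncated importance ratio
    raise ValueError(f"unknown method: {method}")
```

The truncation in the retrace coefficient keeps each factor at or below `lam`, which bounds the variance that an unclipped importance ratio can introduce when `mu` is small.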

Feature-wise transformations

Harm de Vries

2018-07-08

Distill (published)

doi.org

MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation

Konstantinos Drossos

Stylianos Ioannis Mimilakis

Dmitriy Serdyuk

Gerald Schuller

Tuomas Virtanen

Yoshua Bengio

The monaural singing voice separation task focuses on the prediction of the singing voice from a single channel music mixture signal. Current state of the art (SOTA) results in monaural singing voice separation are obtained with deep learning based methods. In this work we present a novel recurrent neural approach that learns long-term temporal patterns and structures of a musical piece. We build upon the recently proposed Masker-Denoiser (MaD) architecture and we enhance it with the Twin Networks, a technique to regularize a recurrent generative network using a backward running copy of the network. We evaluate our method using the Demixing Secret Dataset and we obtain an increment to signal-to-distortion ratio (SDR) of 0.37 dB and to signal-to-interference ratio (SIR) of 0.23 dB, compared to previous SOTA results.

2018-07-07

2018 International Joint Conference on Neural Networks (IJCNN) (published)

doi.org

arxiv.org

Information Fusion in Deep Convolutional Neural Networks for Biomedical Image Segmentation

Mohammad Havaei

Nicolas Guizard

Nicolas Chapados

Yoshua Bengio

2018-07-03

Signal Processing and Machine Learning for Biomedical Big Data (published)

doi.org

Addressing Function Approximation Error in Actor-Critic Methods

Scott Fujimoto

Herke van Hoof

David Meger

In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies. We show that this problem persists in an actor-critic setting and propose novel mechanisms to minimize its effects on both the actor and the critic. Our algorithm builds on Double Q-learning, by taking the minimum value between a pair of critics to limit overestimation. We draw the connection between target networks and overestimation bias, and suggest delaying policy updates to reduce per-update error and further improve performance. We evaluate our method on the suite of OpenAI gym tasks, outperforming the state of the art in every environment tested.

2018-07-02

Proceedings of the 35th International Conference on Machine Learning (published)

proceedings.mlr.press
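
The two mechanisms named in the actor-critic abstract above, taking the minimum over a pair of critics and delaying policy updates, can be sketched in a few lines. This is a simplified illustration on scalar values, not the paper's implementation: the function names are hypothetical and the default `policy_delay=2` is an assumed setting.

```python
def clipped_double_q_target(reward, done, gamma, q1_next, q2_next):
    """Bootstrapped target using the smaller of two critic estimates.

    q1_next and q2_next are the two target critics' values for the next
    state-action pair. Taking the minimum limits overestimation.
    """
    if done:
        return reward  # no bootstrapping past a terminal state
    return reward + gamma * min(q1_next, q2_next)


def should_update_actor(step, policy_delay=2):
    """Update the actor only every `policy_delay` critic updates (assumed value)."""
    return step % policy_delay == 0
```

Both critics would be regressed toward the same target, while the actor and any target networks are refreshed only on the steps where `should_update_actor` returns `True`.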

Augmented CycleGAN: Learning Many-to-Many Mappings from Unpaired Data

Amjad Almahairi

Sai Rajeswar

Alessandro Sordoni

Philip Bachman

Aaron Courville

Learning inter-domain mappings from unpaired data can improve performance in structured prediction tasks, such as image segmentation, by reducing the need for paired data. CycleGAN was recently proposed for this problem, but critically assumes the underlying inter-domain mapping is approximately deterministic and one-to-one. This assumption renders the model ineffective for tasks requiring flexible, many-to-many mappings. We propose a new model, called Augmented CycleGAN, which learns many-to-many mappings between domains. We examine Augmented CycleGAN qualitatively and quantitatively on several image datasets.

2018-07-02

Proceedings of the 35th International Conference on Machine Learning (published)

proceedings.mlr.press

Convergent Tree Backup and Retrace with Function Approximation

Off-policy learning is key to scaling up reinforcement learning as it allows learning about a target policy from the experience generated by a different behavior policy. Unfortunately, it has been challenging to combine off-policy learning with function approximation and multi-step bootstrapping in a way that leads to both stable and efficient algorithms. In this work, we show that the Tree Backup and Retrace algorithms are unstable with linear function approximation, both in theory and in practice with specific examples. Based on our analysis, we then derive stable and efficient gradient-based algorithms using a quadratic convex-concave saddle-point formulation. By exploiting the problem structure proper to these algorithms, we are able to provide convergence guarantees and finite-sample bounds. The applicability of our new analysis also goes beyond Tree Backup and Retrace and allows us to provide new convergence rates for the GTD and GTD2 algorithms without having recourse to projections or Polyak averaging.

2018-07-02

Proceedings of the 35th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press

Focused Hierarchical RNNs for Conditional Sequence Processing

Nan Rosemary Ke

Adam Trischler

Recurrent Neural Networks (RNNs) with attention mechanisms have obtained state-of-the-art results for many sequence processing tasks. Most of these models use a simple form of encoder with attention that looks over the entire sequence and assigns a weight to each token independently. We present a mechanism for focusing RNN encoders for sequence modelling tasks which allows them to attend to key parts of the input as needed. We formulate this using a multi-layer conditional sequence encoder that reads in one token at a time and makes a discrete decision on whether the token is relevant to the context or question being asked. The discrete gating mechanism takes in the context embedding and the current hidden state as inputs and controls information flow into the layer above. We train it using policy gradient methods. We evaluate this method on several types of tasks with different attributes. First, we evaluate the method on synthetic tasks which allow us to evaluate the model for its generalization ability and probe the behavior of the gates in more controlled settings. We then evaluate this approach on large scale Question Answering tasks including the challenging MS MARCO and SearchQA tasks. Our model shows consistent improvements for both tasks over prior work and our baselines. It has also been shown to generalize significantly better on synthetic tasks as compared to the baselines.

2018-07-02

Proceedings of the 35th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press

Mutual Information Neural Estimation

Mohamed Ishmael Belghazi

Sai Rajeswar

R Devon Hjelm

We argue that the estimation of mutual information between high dimensional continuous random variables can be achieved by gradient descent over neural networks. We present a Mutual Information Neural Estimator (MINE) that is linearly scalable in dimensionality as well as in sample size, trainable through back-prop, and strongly consistent. We present a handful of applications on which MINE can be used to minimize or maximize mutual information. We apply MINE to improve adversarially trained generative models. We also use MINE to implement Information Bottleneck, applying it to supervised classification; our results demonstrate substantial improvement in flexibility and performance in these settings.

2018-07-02

Proceedings of the 35th International Conference on Machine Learning (published)

doi.org

proceedings.mlr.press
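
As a rough illustration of how a mutual information estimate can be turned into an objective for gradient descent, the sketch below evaluates the Donsker-Varadhan lower bound on a batch of scores. The abstract above does not name this bound, so its use here is an assumption; the function name is hypothetical, and the scores stand in for the outputs of a trainable statistics network that is not shown.

```python
import math


def dv_lower_bound(joint_scores, marginal_scores):
    """Donsker-Varadhan lower bound on mutual information (in nats).

    joint_scores    : network outputs T(x, z) on pairs drawn from the joint
    marginal_scores : network outputs T(x, z') on pairs where z' is drawn
                      independently of x (for example, by shuffling z)

    Returns mean(T on joint) - log(mean(exp(T on marginals))).
    """
    mean_joint = sum(joint_scores) / len(joint_scores)
    mean_exp_marginal = sum(math.exp(t) for t in marginal_scores) / len(marginal_scores)
    return mean_joint - math.log(mean_exp_marginal)
```

Maximizing this quantity over the parameters of the scoring network tightens the bound; a constant scoring function yields an estimate of zero, as expected when the network carries no information about the pairing.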
