Doina Precup

Samin Yeasar Arnob

PhD - McGill University

Sumana Basu

Collaborating Alumni - McGill University

Co-supervisor :

Adriana Romero Soriano

Collaborating Alumni - McGill University

Raymond Chua

PhD - McGill University

Co-supervisor :

PhD - McGill University

Principal supervisor :

David Meger

Jonathan Colaço Carr

Master's Research - McGill University

Principal supervisor :

Prakash Panangaden

Élodie Coté-Gauthier

Collaborating researcher - McGill University

Franco Del Balso

Collaborating researcher - Université de Montréal

Jesse Farebrother

PhD - McGill University

Principal supervisor :

Marc Gendron-Bellemare

PhD - McGill University

Principal supervisor :

Collaborating researcher - Birla Institute of Technology

Jonathan Hu

Master's Research - McGill University

Howard Huang

PhD - McGill University

Haque Ishfaq

Collaborating Alumni - McGill University

Mohammad Sami Nur Islam Islam

Master's Research - McGill University

Hangzhan Jin

PhD - Polytechnique Montréal

Martin Klissarov

PhD - McGill University

Postdoctorate - McGill University

Jonathan Lebensold

Collaborating Alumni - McGill University

Collaborating Alumni - McGill University

Ray Luo

PhD - McGill University

Principal supervisor :

G McCracken

PhD - McGill University

Nazanin Mohammadi Sepahvand

Collaborating Alumni - McGill University

Shahrad Mohammadzadeh

Master's Research - McGill University

Principal supervisor :

Gabriela Moisescu-Pareja

Collaborating researcher - McGill University

Co-supervisor :

Irina Rish

Padideh Nouri

PhD - Université de Montréal

Co-supervisor :

PhD - McGill University

Co-supervisor :

Research Intern - McGill University

Nate Rahn

PhD - McGill University

Principal supervisor :

Marc Gendron-Bellemare

Manoosh Samiei

PhD - McGill University

Co-supervisor :

PhD - McGill University

Co-supervisor :

PhD - McGill University

Nishanth Anand Vemgal

PhD - McGill University

PhD - McGill University

Co-supervisor :

Samira Ebrahimi Kahou

Research Intern - McGill University

Zihan Wang

PhD - McGill University

Skipper: Combining Spatial and Temporal Abstraction for Better Generalization

Steve Wen

Master's Research - McGill University

Co-supervisor :

Gregory Dudek

Zijing Wu

PhD - McGill University

Principal supervisor :

PhD - McGill University

Harry Zhao

Collaborating Alumni - McGill University

Co-supervisor :

Blog Posts

February 22, 2024

Mingde Harry Zhao

Safa Alver

Harm van Seijen

Romain Laroche

Doina Precup

Yoshua Bengio

Publications

Ubenwa: Cry-based Diagnosis of Birth Asphyxia

Charles Onu

Innocent Udeogu

Eyenimi Ndiomu

Urbain Kengni

Guilherme M. Sant’Anna

E. Alikor

P. Opara

Every year, 3 million newborns die within the first month of life. Birth asphyxia and other breathing-related conditions are a leading cause of mortality during the neonatal phase. Current diagnostic methods are too sophisticated in terms of equipment, required expertise, and general logistics. Consequently, early detection of asphyxia in newborns is very difficult in many parts of the world, especially in resource-poor settings. We are developing a machine learning system, dubbed Ubenwa, which enables diagnosis of asphyxia through automated analysis of the infant cry. Deployed via smartphone and wearable technology, Ubenwa will drastically reduce the time, cost and skill required to make accurate and potentially life-saving diagnoses.

2017-11-16

ArXiv (preprint)

Neural Network Based Nonlinear Weighted Finite Automata

Tianyu Li

Guillaume Rabusseau

Weighted finite automata (WFA) can expressively model functions defined over strings but are inherently linear models. Given the recent successes of nonlinear models in machine learning, it is natural to wonder whether extending WFA to the nonlinear setting would be beneficial. In this paper, we propose a novel model of neural network based nonlinear WFA model (NL-WFA) along with a learning algorithm. Our learning algorithm is inspired by the spectral learning algorithm for WFA and relies on a nonlinear decomposition of the so-called Hankel matrix, by means of an auto-encoder network. The expressive power of NL-WFA and the proposed learning algorithm are assessed on both synthetic and real-world data, showing that NL-WFA can lead to smaller model sizes and infer complex grammatical structures from data.

2017-09-12

ArXiv (preprint)

Predicting Future Disease Activity and Treatment Responders for Multiple Sclerosis Patients Using a Bag-of-Lesions Brain Representation

Andrew Doyle

Douglas Arnold

Tal Arbel

2017-09-03

Medical Image Computing and Computer Assisted Intervention − MICCAI 2017 (published)

Horizontal and Vertical Self-Adaptive Cloud Controller with Reward Optimization for Resource Allocation

Jesús Alejandro Cárdenes Cabré

Ricardo Sanz

Over-booking or under-booking of computing resources leads to higher cost and performance degradation of web applications. To optimize the performance of web applications, access to the resources has to be dynamically controlled ensuring maximum cost-performance ratio of the application while fulfilling requirements. To simplify the design of dynamic cloud controllers, we propose a horizontal and vertical scalability self-aware agent defined by a self-adaptive fuzzy logic with an oriented random optimizer based on reward and memory. The algorithm dynamically adjusts the membership functions and their relationship, maximizing the reward of the system while considering the cost related to the deployment of new resources. The evaluation of the controller under real cloud workload reveals the ability of the algorithm to maximize the performance of the web application based on the target parameters given by an operator.

2017-08-31

International Conference on Cloud and Autonomic Computing (published)

World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions

Teng Long

Emmanuel Bengio

Ryan Lowe

Jackie CK Cheung

Humans interpret texts with respect to some background information, or world knowledge, and we would like to develop automatic reading comprehension systems that can do the same. In this paper, we introduce a task and several models to drive progress towards this goal. In particular, we propose the task of rare entity prediction: given a web document with several entities removed, models are tasked with predicting the correct missing entities conditioned on the document context and the lexical resources. This task is challenging due to the diversity of language styles and the extremely large number of rare entities. We propose two recurrent neural network architectures which make use of external knowledge in the form of entity descriptions. Our experiments show that our hierarchical LSTM model performs significantly better at the rare entity prediction task than those that do not make use of external resources.

2017-08-31

Conference on Empirical Methods in Natural Language Processing (published)

Approximate Value Iteration with Temporally Extended Actions (Extended Abstract)

Timothy A. Mann

Shie Mannor

The options framework provides a concrete way to implement and reason about temporally extended actions. Existing literature has demonstrated the value of planning with options empirically, but there is a lack of theoretical analysis formalizing when planning with options is more efficient than planning with primitive actions. We provide a general analysis of the convergence rate of a popular Approximate Value Iteration (AVI) algorithm called Fitted Value Iteration (FVI) with options. Our analysis reveals that longer duration options and a pessimistic estimate of the value function both lead to faster convergence. Furthermore, options can improve convergence even when they are suboptimal and sparsely distributed throughout the state space. Next we consider generating useful options for planning based on a subset of landmark states. This suggests a new algorithm, Landmark-based AVI (LAVI), that represents the value function only at landmark states. We analyze OFVI and LAVI using the proposed landmark-based options and compare the two algorithms. Our theoretical and experimental results demonstrate that options can play an important role in AVI by decreasing approximation error and inducing fast convergence.

2017-07-31

International Joint Conference on Artificial Intelligence (published)

Prediction of Extubation readiness in extremely preterm infants by the automated analysis of cardiorespiratory behavior: study protocol

Wissam Shalish

Lara J. Kanbar

Smita Rao

Carlos A. Robles-Rubio

Lajos Kovacs

Sanjay Chawla

Martin Keszler

Karen Brown

Robert E. Kearney

Guilherme M. Sant’Anna

Background: Extremely preterm infants (≤ 28 weeks gestation) commonly require endotracheal intubation and mechanical ventilation (MV) to maintain adequate oxygenation and gas exchange. Given that MV is independently associated with important adverse outcomes, efforts should be made to limit its duration. However, current methods for determining extubation readiness are inaccurate and a significant number of infants fail extubation and require reintubation, an intervention that may be associated with increased morbidities. A variety of objective measures have been proposed to better define the optimal time for extubation, but none have proven clinically useful. In a pilot study, investigators from this group have shown promising results from sophisticated, automated analyses of cardiorespiratory signals as a predictor of extubation readiness. The aim of this study is to develop an automated predictor of extubation readiness using a combination of clinical tools along with novel and automated measures of cardiorespiratory behavior, to assist clinicians in determining when extremely preterm infants are ready for extubation. Methods: In this prospective, multicenter observational study, cardiorespiratory signals will be recorded from 250 eligible extremely preterm infants with birth weights ≤ 1250 g immediately prior to their first planned extubation. Automated signal analysis algorithms will compute a variety of metrics for each infant, and machine learning methods will then be used to find the optimal combination of these metrics together with clinical variables that provide the best overall prediction of extubation readiness. Using these results, investigators will develop an Automated system for Prediction of EXtubation (APEX) readiness that will integrate the software for data acquisition, signal analysis, and outcome prediction into a single application suitable for use by medical personnel in the neonatal intensive care unit. The performance of APEX will later be prospectively validated in 50 additional infants. Discussion: The results of this research will provide the quantitative evidence needed to assist clinicians in determining when to extubate a preterm infant with the highest probability of success, and could produce significant improvements in extubation outcomes in this population. Trial registration: Clinicaltrials.gov identifier: NCT01909947. Registered on July 17 2013. Trial sponsor: Canadian Institutes of Health Research (CIHR).

2017-07-16

BMC Pediatrics (published)

A Semi-Markov Chain Approach to Modeling Respiratory Patterns Prior to Extubation in Preterm Infants

Charles C. Onu

Lara J. Kanbar

Wissam Shalish

Karen A. Brown

Guilherme M. Sant'Anna

Robert E. Kearney

After birth, extremely preterm infants often require specialized respiratory management in the form of invasive mechanical ventilation (IMV). Protracted IMV is associated with detrimental outcomes and morbidities. Premature extubation, on the other hand, would necessitate reintubation which is risky, technically challenging and could further lead to lung injury or disease. We present an approach to modeling respiratory patterns of infants who succeeded extubation and those who required reintubation which relies on Markov models. We compare the use of traditional Markov chains to semi-Markov models which emphasize cross-pattern transitions and timing information, and to multi-chain Markov models which can concisely represent non-stationarity in respiratory behavior over time. The models we developed expose specific, unique similarities as well as vital differences between the two populations.

2017-07-10

2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (published)

APEX_SCOPE: A graphical user interface for visualization of multi-modal data in inter-disciplinary studies.

Lara J. Kanbar

Wissam Shalish

Karen A. Brown

Guilherme M. Sant'Anna

Robert E. Kearney

2017-06-30

Annual International Conference of the IEEE Engineering in Medicine and Biology Society (published)

Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control

Policy gradient methods in reinforcement learning have become increasingly prevalent for state-of-the-art performance in continuous control tasks. Novel methods typically benchmark against a few key algorithms such as deep deterministic policy gradients and trust region policy optimization. As such, it is important to present and use consistent baseline experiments. However, this can be difficult due to general variance in the algorithms, hyper-parameter tuning, and environment stochasticity. We investigate and discuss: the significance of hyper-parameters in policy gradients for continuous control, general variance in the algorithms, and reproducibility of reported results. We provide guidelines on reporting novel results as comparisons against baseline methods such that future researchers can make informed decisions when investigating novel methods.

2017-06-16

ICML.cc/2017/RML (poster)

openreview.net

Investigating Recurrence and Eligibility Traces in Deep Q-Networks

Jean Harb

Eligibility traces in reinforcement learning are used as a bias-variance trade-off and can often speed up training time by propagating knowledge back over time-steps in a single update. We investigate the use of eligibility traces in combination with recurrent networks in the Atari domain. We illustrate the benefits of both recurrent nets and eligibility traces in some Atari games, and highlight also the importance of the optimization used in the training.

2017-04-17

ArXiv (preprint)

openreview.net

Multi-Timescale, Gradient Descent, Temporal Difference Learning with Linear Options

Peeyush T. Kumar

Deliberating on large or continuous state spaces has been a long-standing challenge in reinforcement learning. Temporal abstraction has somewhat made this possible, but efficiently planning using temporal abstraction still remains an issue. Moreover, using spatial abstractions to learn policies for various situations at once while using temporal abstraction models is an open problem. We propose here an efficient algorithm which is convergent under linear function approximation while planning using temporally abstract actions. We show how this algorithm can be used along with randomly generated option models over multiple time scales to plan for agents which need to act in real time. Using these randomly generated option models over multiple time scales is shown to reduce the number of decision epochs required to solve the given task, hence effectively reducing the time needed for deliberation.

2017-03-18

ArXiv (preprint)