Doina Precup

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, McGill University, School of Computer Science
Research Team Lead, Google DeepMind
Research Topics
Medical Machine Learning
Reinforcement Learning
Probabilistic Models
Molecular Modeling
Reasoning

Biography

Doina Precup teaches at McGill University while conducting fundamental research on reinforcement learning, including applications of AI in areas with social impact, such as health care. She is interested in automated decision-making under high uncertainty.

She is a member of the Canadian Institute for Advanced Research (CIFAR) and of the Association for the Advancement of Artificial Intelligence (AAAI), and she heads the Montreal office of DeepMind.

Her areas of expertise are artificial intelligence, machine learning, reinforcement learning, reasoning and planning under uncertainty, and applications.

Current Students

Research Intern - McGill
Alumni Collaborator - McGill
Co-supervisor:
Alumni Collaborator - McGill
PhD - McGill
Co-supervisor:
PhD - McGill
Principal supervisor:
Research Master's - McGill
Principal supervisor:
Research Collaborator - McGill
Co-supervisor:
Research Collaborator - UdeM
PhD - McGill
Principal supervisor:
PhD - McGill
Principal supervisor:
Research Collaborator - Birla Institute of Technology
Research Master's - McGill
PhD - McGill
Alumni Collaborator - McGill
Research Master's - McGill
PhD - Polytechnique
Postdoctorate - McGill
Alumni Collaborator - McGill
Alumni Collaborator - McGill
PhD - McGill
Principal supervisor:
PhD - McGill
Alumni Collaborator - McGill
Research Master's - McGill
Principal supervisor:
Research Collaborator - McGill
Co-supervisor:
PhD - UdeM
Co-supervisor:
PhD - McGill
Co-supervisor:
Research Intern - McGill
PhD - McGill
Principal supervisor:
PhD - McGill
Co-supervisor:
PhD - McGill
Co-supervisor:
PhD - McGill
Co-supervisor:
Research Intern - McGill
PhD - McGill
Research Master's - McGill
Co-supervisor:
PhD - McGill
Principal supervisor:
PhD - McGill
Alumni Collaborator - McGill
Co-supervisor:

Publications

Ubenwa: Cry-based Diagnosis of Birth Asphyxia
Innocent Udeogu
Eyenimi Ndiomu
Urbain Kengni
Guilherme M. Sant’Anna
E. Alikor
P. Opara
Every year, 3 million newborns die within the first month of life. Birth asphyxia and other breathing-related conditions are a leading cause of mortality during the neonatal phase. Current diagnostic methods are too sophisticated in terms of equipment, required expertise, and general logistics. Consequently, early detection of asphyxia in newborns is very difficult in many parts of the world, especially in resource-poor settings. We are developing a machine learning system, dubbed Ubenwa, which enables diagnosis of asphyxia through automated analysis of the infant cry. Deployed via smartphone and wearable technology, Ubenwa will drastically reduce the time, cost and skill required to make accurate and potentially life-saving diagnoses.
Neural Network Based Nonlinear Weighted Finite Automata
Weighted finite automata (WFA) can expressively model functions defined over strings but are inherently linear models. Given the recent successes of nonlinear models in machine learning, it is natural to wonder whether extending WFA to the nonlinear setting would be beneficial. In this paper, we propose a novel neural network based nonlinear WFA model (NL-WFA) along with a learning algorithm. Our learning algorithm is inspired by the spectral learning algorithm for WFA and relies on a nonlinear decomposition of the so-called Hankel matrix, by means of an auto-encoder network. The expressive power of NL-WFA and the proposed learning algorithm are assessed on both synthetic and real-world data, showing that NL-WFA can lead to smaller model sizes and infer complex grammatical structures from data.
Predicting Future Disease Activity and Treatment Responders for Multiple Sclerosis Patients Using a Bag-of-Lesions Brain Representation
Andrew Doyle
Douglas Arnold
Horizontal and Vertical Self-Adaptive Cloud Controller with Reward Optimization for Resource Allocation
Jesús Alejandro Cárdenes Cabré
Ricardo Sanz
Over-booking or under-booking of computing resources leads to higher cost and performance degradation of web applications. To optimize the performance of web applications, access to the resources has to be dynamically controlled, ensuring the maximum cost-performance ratio of the application while fulfilling requirements. To simplify the design of dynamic cloud controllers, we propose a horizontal and vertical scalability self-aware agent defined by a self-adaptive fuzzy logic with an oriented random optimizer based on reward and memory. The algorithm dynamically adjusts the membership functions and their relationship, maximizing the reward of the system while considering the cost related to the deployment of new resources. The evaluation of the controller under real cloud workload reveals the ability of the algorithm to maximize the performance of the web application based on the target parameters given by an operator.
World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions
Teng Long
Jackie CK Cheung
Humans interpret texts with respect to some background information, or world knowledge, and we would like to develop automatic reading comprehension systems that can do the same. In this paper, we introduce a task and several models to drive progress towards this goal. In particular, we propose the task of rare entity prediction: given a web document with several entities removed, models are tasked with predicting the correct missing entities conditioned on the document context and the lexical resources. This task is challenging due to the diversity of language styles and the extremely large number of rare entities. We propose two recurrent neural network architectures which make use of external knowledge in the form of entity descriptions. Our experiments show that our hierarchical LSTM model performs significantly better at the rare entity prediction task than those that do not make use of external resources.
Approximate Value Iteration with Temporally Extended Actions (Extended Abstract)
Timothy A. Mann
Shie Mannor
The options framework provides a concrete way to implement and reason about temporally extended actions. Existing literature has demonstrated the value of planning with options empirically, but there is a lack of theoretical analysis formalizing when planning with options is more efficient than planning with primitive actions. We provide a general analysis of the convergence rate of a popular Approximate Value Iteration (AVI) algorithm called Fitted Value Iteration (FVI) with options. Our analysis reveals that longer duration options and a pessimistic estimate of the value function both lead to faster convergence. Furthermore, options can improve convergence even when they are suboptimal and sparsely distributed throughout the state space. Next we consider generating useful options for planning based on a subset of landmark states. This suggests a new algorithm, Landmark-based AVI (LAVI), that represents the value function only at landmark states. We analyze OFVI and LAVI using the proposed landmark-based options and compare the two algorithms. Our theoretical and experimental results demonstrate that options can play an important role in AVI by decreasing approximation error and inducing fast convergence.
Prediction of Extubation readiness in extremely preterm infants by the automated analysis of cardiorespiratory behavior: study protocol
Wissam Shalish
Lara J. Kanbar
Smita Rao
Carlos A. Robles-Rubio
Lajos Kovacs
Sanjay Chawla
Martin Keszler
Karen Brown
Robert E. Kearney
Guilherme M. Sant’Anna
Background: Extremely preterm infants (≤ 28 weeks gestation) commonly require endotracheal intubation and mechanical ventilation (MV) to maintain adequate oxygenation and gas exchange. Given that MV is independently associated with important adverse outcomes, efforts should be made to limit its duration. However, current methods for determining extubation readiness are inaccurate and a significant number of infants fail extubation and require reintubation, an intervention that may be associated with increased morbidities. A variety of objective measures have been proposed to better define the optimal time for extubation, but none have proven clinically useful. In a pilot study, investigators from this group have shown promising results from sophisticated, automated analyses of cardiorespiratory signals as a predictor of extubation readiness. The aim of this study is to develop an automated predictor of extubation readiness using a combination of clinical tools along with novel and automated measures of cardiorespiratory behavior, to assist clinicians in determining when extremely preterm infants are ready for extubation.
Methods: In this prospective, multicenter observational study, cardiorespiratory signals will be recorded from 250 eligible extremely preterm infants with birth weights ≤ 1250 g immediately prior to their first planned extubation. Automated signal analysis algorithms will compute a variety of metrics for each infant, and machine learning methods will then be used to find the optimal combination of these metrics together with clinical variables that provide the best overall prediction of extubation readiness. Using these results, investigators will develop an Automated system for Prediction of EXtubation (APEX) readiness that will integrate the software for data acquisition, signal analysis, and outcome prediction into a single application suitable for use by medical personnel in the neonatal intensive care unit. The performance of APEX will later be prospectively validated in 50 additional infants.
Discussion: The results of this research will provide the quantitative evidence needed to assist clinicians in determining when to extubate a preterm infant with the highest probability of success, and could produce significant improvements in extubation outcomes in this population.
Trial registration: Clinicaltrials.gov identifier: NCT01909947. Registered on July 17 2013. Trial sponsor: Canadian Institutes of Health Research (CIHR).
A Semi-Markov Chain Approach to Modeling Respiratory Patterns Prior to Extubation in Preterm Infants
Charles C. Onu
Lara J. Kanbar
Wissam Shalish
Karen A. Brown
Guilherme M. Sant'Anna
Robert E. Kearney
After birth, extremely preterm infants often require specialized respiratory management in the form of invasive mechanical ventilation (IMV). Protracted IMV is associated with detrimental outcomes and morbidities. Premature extubation, on the other hand, would necessitate reintubation, which is risky, technically challenging and could further lead to lung injury or disease. We present an approach to modeling respiratory patterns of infants who succeeded extubation and those who required reintubation which relies on Markov models. We compare the use of traditional Markov chains to semi-Markov models which emphasize cross-pattern transitions and timing information, and to multi-chain Markov models which can concisely represent non-stationarity in respiratory behavior over time. The models we developed expose specific, unique similarities as well as vital differences between the two populations.
APEX_SCOPE: A graphical user interface for visualization of multi-modal data in inter-disciplinary studies.
Lara J. Kanbar
Wissam Shalish
Karen A. Brown
Guilherme M. Sant'Anna
Robert E. Kearney
Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control
Policy gradient methods in reinforcement learning have become increasingly prevalent for state-of-the-art performance in continuous control tasks. Novel methods typically benchmark against a few key algorithms such as deep deterministic policy gradients and trust region policy optimization. As such, it is important to present and use consistent baseline experiments. However, this can be difficult due to general variance in the algorithms, hyper-parameter tuning, and environment stochasticity. We investigate and discuss: the significance of hyper-parameters in policy gradients for continuous control, general variance in the algorithms, and reproducibility of reported results. We provide guidelines on reporting novel results as comparisons against baseline methods such that future researchers can make informed decisions when investigating novel methods.
Investigating Recurrence and Eligibility Traces in Deep Q-Networks
Jean Harb
Eligibility traces in reinforcement learning are used as a bias-variance trade-off and can often speed up training time by propagating knowledge back over time-steps in a single update. We investigate the use of eligibility traces in combination with recurrent networks in the Atari domain. We illustrate the benefits of both recurrent nets and eligibility traces in some Atari games, and also highlight the importance of the optimization used in the training.
Multi-Timescale, Gradient Descent, Temporal Difference Learning with Linear Options
Peeyush T. Kumar
Deliberating on large or continuous state spaces has been a long-standing challenge in reinforcement learning. Temporal abstraction has somewhat made this possible, but efficiently planning using temporal abstraction still remains an issue. Moreover, using spatial abstractions to learn policies for various situations at once while using temporal abstraction models is an open problem. We propose here an efficient algorithm which is convergent under linear function approximation while planning using temporally abstract actions. We show how this algorithm can be used along with randomly generated option models over multiple time scales to plan for agents which need to act in real time. Using these randomly generated option models over multiple time scales is shown to reduce the number of decision epochs required to solve the given task, hence effectively reducing the time needed for deliberation.