Joelle Pineau

Core Academic Member
Canada CIFAR AI Chair
Associate Professor, McGill University, School of Computer Science
Co-Managing Director, Meta AI (FAIR - Facebook AI Research)
Research Topics
Medical Machine Learning
Reinforcement Learning
Natural Language Processing

Biography

Joelle Pineau is an associate professor and William Dawson Scholar at McGill University, where she co-directs the Reasoning and Learning Lab. She is a faculty member of Mila – Quebec Artificial Intelligence Institute and holds a Canada CIFAR AI Chair. She is also Vice-President of AI Research at Meta (formerly Facebook), where she leads the FAIR (Fundamental AI Research) team. She holds a Bachelor of Science in engineering from the University of Waterloo, and a master's degree and a PhD in robotics from Carnegie Mellon University.

Her research focuses on developing new models and algorithms for planning and learning in complex, partially observable domains. She also works on applying these algorithms to complex problems in robotics, health care, games and conversational agents. She serves on the editorial boards of the Journal of Artificial Intelligence Research and the Journal of Machine Learning Research, and is currently president of the International Machine Learning Society. She received the 2018 E.W.R. Steacie Memorial Fellowship from the Natural Sciences and Engineering Research Council of Canada (NSERC) and the 2019 Governor General's Innovation Award. She is a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI), a Senior Fellow of the Canadian Institute for Advanced Research (CIFAR) and a member of the Royal Society of Canada.

Current Students

Research Intern - Université de Montréal
PhD - UdeM
Principal supervisor:
PhD - McGill
Co-supervisor:
PhD - McGill
Research Intern - McGill
Research Intern - UdeM

Publications

SPeCiaL: Self-Supervised Pretraining for Continual Learning
Lucas Caccia
A Generalized Bootstrap Target for Value-Learning, Efficiently Combining Value and Feature Predictions
Anthony GX-Chen
Veronica Chelu
Biomedical Research and Informatics Living Laboratory for Innovative Advances of New Technologies in Community Mobility Rehabilitation: Protocol for Evaluation and Rehabilitation of Mobility Across Continuums of Care
Sara Ahmed
Philippe Archambault
Claudine Auger
Joyce Fung
Eva Kehayia
Anouk Lamontagne
Annette Majnemer
Sylvie Nadeau
Alain Ptito
Bonnie Swaine
Background: Rapid advances in technologies over the past 10 years have enabled large-scale biomedical and psychosocial rehabilitation research to improve the function and social integration of persons with physical impairments across the lifespan. The Biomedical Research and Informatics Living Laboratory for Innovative Advances of New Technologies (BRILLIANT) in community mobility rehabilitation aims to generate evidence-based research to improve rehabilitation for individuals with acquired brain injury (ABI). Objective: This study aims to (1) identify the factors limiting or enhancing mobility in real-world community environments (public spaces, including the mall, home, and outdoors) and understand their complex interplay in individuals of all ages with ABI and (2) customize community environment mobility training by identifying, on a continuous basis, the specific rehabilitation strategies and interventions that patient subgroups benefit from most. Here, we present the research and technology plan for the BRILLIANT initiative. Methods: A cohort of individuals, adults and children, with ABI (N=1500) will be recruited. Patients will be recruited from the acute care and rehabilitation partner centers within 4 health regions (living labs) and followed throughout the continuum of rehabilitation. Participants will also be recruited from the community. Biomedical, clinician-reported, patient-reported, and brain imaging data will be collected. Theme 1 will implement and evaluate the feasibility of collecting data across BRILLIANT living labs and conduct predictive analyses and artificial intelligence (AI) to identify mobility subgroups. Theme 2 will implement, evaluate, and identify community mobility interventions that optimize outcomes for mobility subgroups of patients with ABI.
Results: The biomedical infrastructure and equipment have been established across the living labs, and development of the clinician- and patient-reported outcome digital solutions is underway. Recruitment is expected to begin in May 2022. Conclusions: The program will develop and deploy a comprehensive clinical and community-based mobility-monitoring system to evaluate the factors that result in poor mobility, and develop personalized mobility interventions that are optimized for specific patient subgroups. Technology solutions will be designed to support clinicians and patients to deliver cost-effective care and the right intervention to the right person at the right time to optimize long-term functional potential and meaningful participation in the community. International Registered Report Identifier (IRRID): PRR1-10.2196/12506
Block Contextual MDPs for Continual Learning
Shagun Sodhani
Franziska Meier
Amy Zhang
In reinforcement learning (RL), when defining a Markov Decision Process (MDP), the environment dynamics is implicitly assumed to be stationary. This assumption of stationarity, while simplifying, can be unrealistic in many scenarios. In the continual reinforcement learning scenario, the sequence of tasks is another source of nonstationarity. In this work, we propose to examine this continual reinforcement learning setting through the Block Contextual MDP (BC-MDP) framework, which enables us to relax the assumption of stationarity. This framework challenges RL algorithms to handle both nonstationarity and rich observation settings and, by additionally leveraging smoothness properties, enables us to study generalization bounds for this setting. Finally, we take inspiration from adaptive control to propose a novel algorithm that addresses the challenges introduced by this more realistic BC-MDP setting, allows for zero-shot adaptation at evaluation time, and achieves strong performance on several nonstationary environments.
Improving Passage Retrieval with Zero-Shot Question Generation
Devendra Singh Sachan
Mike Lewis
Mandar S. Joshi
Armen Aghajanyan
Wen-tau Yih
Luke Zettlemoyer
We propose a simple and effective re-ranking method for improving passage retrieval in open question answering. The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage. This approach can be applied on top of any retrieval method (e.g. neural or keyword-based), does not require any domain- or task-specific training (and therefore is expected to generalize better to data distribution shifts), and provides rich cross-attention between query and passage (i.e. it must explain every token in the question). When evaluated on a number of open-domain retrieval datasets, our re-ranker improves strong unsupervised retrieval models by 6%-18% absolute and strong supervised models by up to 12% in terms of top-20 passage retrieval accuracy. We also obtain new state-of-the-art results on full open-domain question answering by simply adding the new re-ranker to existing models with no further changes.
Robust Policy Learning over Multiple Uncertainty Sets
Annie Xie
Shagun Sodhani
Chelsea Finn
Amy Zhang
Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments. While system identification methods provide a way to infer the variation from online experience, they can fail in settings where fast identification is not possible. Another dominant approach is robust RL which produces a policy that can handle worst-case scenarios, but these methods are generally designed to achieve robustness to a single uncertainty set that must be specified at train time. Towards a more general solution, we formulate the multi-set robustness problem to learn a policy robust to different perturbation sets. We then design an algorithm that enjoys the benefits of both system identification and robust RL: it reduces uncertainty where possible given a few interactions, but can still act robustly with respect to the remaining uncertainty. On a diverse set of control tasks, our approach demonstrates improved worst-case performance on new environments compared to prior methods based on system identification and on robust RL alone.
Estimating causal effects with optimization-based methods: A review and empirical comparison
Martin Cousineau
Vedat Verter
S. Murphy
New Insights on Reducing Abrupt Representation Change in Online Continual Learning
Lucas Caccia
Rahaf Aljundi
Nader Asadi
Tinne Tuytelaars
In the online continual learning paradigm, agents must learn from a changing distribution while respecting memory and compute constraints. Experience Replay (ER), where a small subset of past data is stored and replayed alongside new data, has emerged as a simple and effective learning strategy. In this work, we focus on the change in representations of observed data that arises when previously unobserved classes appear in the incoming data stream, and new classes must be distinguished from previous ones. We shed new light on this question by showing that applying ER causes the newly added classes’ representations to overlap significantly with the previous classes, leading to highly disruptive parameter updates. Based on this empirical analysis, we propose a new method which mitigates this issue by shielding the learned representations from drastic adaptation to accommodate new classes. We show that using an asymmetric update rule pushes new classes to adapt to the older ones (rather than the reverse), which is more effective especially at task boundaries, where much of the forgetting typically occurs. Empirical results show significant gains over strong baselines on standard continual learning benchmarks.
Biomedical Research and Informatics Living Laboratory for Innovative Advances of New Technologies in Community Mobility Rehabilitation: Protocol for Evaluation and Rehabilitation of Mobility Across Continuums of Care (Preprint)
Sara Ahmed
Philippe Archambault
Claudine Auger
Joyce Fung
Eva Kehayia
Anouk Lamontagne
Annette Majnemer
Sylvie Nadeau
Alain Ptito
Bonnie Swaine
Biomedical Research and Informatics Living Laboratory for Innovative Advances of New Technologies in Community Mobility Rehabilitation: Protocol for Evaluation and Rehabilitation of Mobility Across Continuums of Care
Sara Ahmed
P. Archambault
Claudine Auger
Joyce Phua Pau Fung
Eva Kehayia
Anouk Lamontagne
Annette Majnemer
Sylvie Nadeau
Alain Ptito
B. Swaine
Background: Rapid advances in technologies over the past 10 years have enabled large-scale biomedical and psychosocial rehabilitation research to improve the function and social integration of persons with physical impairments across the lifespan. The Biomedical Research and Informatics Living Laboratory for Innovative Advances of New Technologies (BRILLIANT) in community mobility rehabilitation aims to generate evidence-based research to improve rehabilitation for individuals with acquired brain injury (ABI). Objective: This study aims to (1) identify the factors limiting or enhancing mobility in real-world community environments (public spaces, including the mall, home, and outdoors) and understand their complex interplay in individuals of all ages with ABI and (2) customize community environment mobility training by identifying, on a continuous basis, the specific rehabilitation strategies and interventions that patient subgroups benefit from most. Here, we present the research and technology plan for the BRILLIANT initiative. Methods: A cohort of individuals, adults and children, with ABI (N=1500) will be recruited. Patients will be recruited from the acute care and rehabilitation partner centers within 4 health regions (living labs) and followed throughout the continuum of rehabilitation. Participants will also be recruited from the community. Biomedical, clinician-reported, patient-reported, and brain imaging data will be collected. Theme 1 will implement and evaluate the feasibility of collecting data across BRILLIANT living labs and conduct predictive analyses and artificial intelligence (AI) to identify mobility subgroups. Theme 2 will implement, evaluate, and identify community mobility interventions that optimize outcomes for mobility subgroups of patients with ABI.
Results: The biomedical infrastructure and equipment have been established across the living labs, and development of the clinician- and patient-reported outcome digital solutions is underway. Recruitment is expected to begin in May 2022. Conclusions: The program will develop and deploy a comprehensive clinical and community-based mobility-monitoring system to evaluate the factors that result in poor mobility, and develop personalized mobility interventions that are optimized for specific patient subgroups. Technology solutions will be designed to support clinicians and patients to deliver cost-effective care and the right intervention to the right person at the right time to optimize long-term functional potential and meaningful participation in the community. International Registered Report Identifier (IRRID): PRR1-10.2196/12506
Biomedical Research & Informatics Living Laboratory for Innovative Advances of New Technologies in Community Mobility Rehabilitation: Protocol for a longitudinal evaluation of mobility outcomes (Preprint)
Sara Ahmed
Philippe Archambault
Claudine Auger
Joyce Fung
Eva Kehayia
Anouk Lamontagne
Annette Majnemer
Sylvie Nadeau
Alain Ptito
Bonnie Swaine
The Biomedical Research and Informatics Living Laboratory for Innovative Advances of New Technologies in Community Mobility Rehabilitation (BRILLIANT) program aims to provide evidence-based research to improve rehabilitation for individuals with Acquired Brain Injury (ABI: traumatic brain injury [TBI], cerebral palsy-fetal/perinatal brain injury, and stroke). The vision of the BRILLIANT program is to optimize mobility of persons with ABI across the lifespan. The program will develop and deploy a comprehensive clinical and community-based mobility monitoring system to evaluate the factors that result in poor mobility, and develop personalized mobility interventions that are optimized for specific patient subgroups. These innovations will be used by front-line clinicians to deliver cost-effective care: the right intervention to the right person at the right time, accounting for long-term functional potential and meaningful participation in the community.
A Generalized Bootstrap Target for Value-Learning, Efficiently Combining Value and Feature Predictions
Anthony GX-Chen
Veronica Chelu
Estimating value functions is a core component of reinforcement learning algorithms. Temporal difference (TD) learning algorithms use bootstrapping, i.e. they update the value function toward a learning target using value estimates at subsequent time-steps. Alternatively, the value function can be updated toward a learning target constructed by separately predicting successor features (SF), a policy-dependent model, and linearly combining them with instantaneous rewards. We focus on bootstrapping targets used when estimating value functions, and propose a new backup target, the η-return mixture, which implicitly combines value-predictive knowledge (used by TD methods) with (successor) feature-predictive knowledge, with a parameter η capturing how much to rely on each. We illustrate that incorporating predictive knowledge through an ηγ-discounted SF model makes more efficient use of sampled experience, compared to either extreme, i.e. bootstrapping entirely on the value function estimate, or bootstrapping on the product of separately estimated successor features and instantaneous reward models. We empirically show this approach leads to faster policy evaluation and better control performance, for tabular and nonlinear function approximations, indicating scalability and generality.