Joelle Pineau

Alain Ptito

B. Swaine

Background Rapid advances in technologies over the past 10 years have enabled large-scale biomedical and psychosocial rehabilitation researc… (see more)h to improve the function and social integration of persons with physical impairments across the lifespan. The Biomedical Research and Informatics Living Laboratory for Innovative Advances of New Technologies (BRILLIANT) in community mobility rehabilitation aims to generate evidence-based research to improve rehabilitation for individuals with acquired brain injury (ABI). Objective This study aims to (1) identify the factors limiting or enhancing mobility in real-world community environments (public spaces, including the mall, home, and outdoors) and understand their complex interplay in individuals of all ages with ABI and (2) customize community environment mobility training by identifying, on a continuous basis, the specific rehabilitation strategies and interventions that patient subgroups benefit from most. Here, we present the research and technology plan for the BRILLIANT initiative. Methods A cohort of individuals, adults and children, with ABI (N=1500) will be recruited. Patients will be recruited from the acute care and rehabilitation partner centers within 4 health regions (living labs) and followed throughout the continuum of rehabilitation. Participants will also be recruited from the community. Biomedical, clinician-reported, patient-reported, and brain imaging data will be collected. Theme 1 will implement and evaluate the feasibility of collecting data across BRILLIANT living labs and conduct predictive analyses and artificial intelligence (AI) to identify mobility subgroups. Theme 2 will implement, evaluate, and identify community mobility interventions that optimize outcomes for mobility subgroups of patients with ABI. Results The biomedical infrastructure and equipment have been established across the living labs, and development of the clinician- and patient-reported outcome digital solutions is underway. Recruitment is expected to begin in May 2022. Conclusions The program will develop and deploy a comprehensive clinical and community-based mobility-monitoring system to evaluate the factors that result in poor mobility, and develop personalized mobility interventions that are optimized for specific patient subgroups. Technology solutions will be designed to support clinicians and patients to deliver cost-effective care and the right intervention to the right person at the right time to optimize long-term functional potential and meaningful participation in the community. International Registered Report Identifier (IRRID) PRR1-10.2196/12506

2022-01-13

JMIR Research Protocols (published)

Sara Ahmed

Philippe Archambault

Claudine Auger

Audrey Durand

Joyce Fung

Eva Kehayia

Anouk Lamontagne

Annette Majnemer

Sylvie Nadeau

Alain Ptito

Bonnie Swaine

2022-01-13

(published)

Biomedical Research & Informatics Living Laboratory for Innovative Advances of New Technologies in Community Mobility Rehabilitation: Protocol for a longitudinal evaluation of mobility outcomes (Preprint)

Sara Ahmed

Philippe Archambault

Claudine Auger

Audrey Durand

Joyce Fung

Eva Kehayia

Anouk Lamontagne

Annette Majnemer

Sylvie Nadeau

Alain Ptito

Bonnie Swaine

UNSTRUCTURED The Biomedical Research and Informatics Living Laboratory for Innovative Advances of New Technologies in Community Mobility Re… (see more)habilitation (BRILLIANT) program to provide evidence-based research to improve rehabilitation for individuals with Acquired Brain Injury (ABI: traumatic brain injury [TBI], cerebral palsy-fetal/perinatal brain injury, and stroke). The vision of the BRILLIANT program is to optimize mobility of persons with ABI across the lifespan. The program will develop and deploy a comprehensive clinical and community based mobility monitoring system to evaluate the factors that result in poor mobility, and develop personalized mobility interventions that are optimized for specific patient sub-groups. These innovations will be used by front-line clinicians to deliver cost-effective care; the right intervention to the right person at the right time, accounting for long-term functional potential and meaningful participation in the community.

2022-01-13

(published)

A Generalized Bootstrap Target for Value-Learning, Efficiently Combining Value and Feature Predictions

Anthony GX-Chen

Veronica Chelu

Blake Richards

Estimating value functions is a core component of reinforcement learning algorithms. Temporal difference (TD) learning algorithms use bootst… (see more)rapping, i.e. they update the value function toward a learning target using value estimates at subsequent time-steps. Alternatively, the value function can be updated toward a learning target constructed by separately predicting successor features (SF)—a policy-dependent model—and linearly combining them with instantaneous rewards. We focus on bootstrapping targets used when estimating value functions, and propose a new backup target, the ?-return mixture, which implicitly combines value-predictive knowledge (used by TD methods) with (successor) feature-predictive knowledge—with a parameter ? capturing how much to rely on each. We illustrate that incorporating predictive knowledge through an ??-discounted SF model makes more efficient use of sampled experience, compared to either extreme, i.e. bootstrapping entirely on the value function estimate, or bootstrapping on the product of separately estimated successor features and instantaneous reward models. We empirically show this approach leads to faster policy evaluation and better control performance, for tabular and nonlinear function approximations, indicating scalability and generality.

2022-01-05

ArXiv (preprint)

Robust Policy Learning over Multiple Uncertainty Sets

Annie Xie

Shagun Sodhani

Chelsea Finn

Amy Zhang

Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments. While system identification methods prov… (see more)ide a way to infer the variation from online experience, they can fail in settings where fast identification is not possible. Another dominant approach is robust RL which produces a policy that can handle worst-case scenarios, but these methods are generally designed to achieve robustness to a single uncertainty set that must be specified at train time. Towards a more general solution, we formulate the multi-set robustness problem to learn a policy robust to different perturbation sets. We then design an algorithm that enjoys the benefits of both system identification and robust RL: it reduces uncertainty where possible given a few interactions, but can still act robustly with respect to the remaining uncertainty. On a diverse set of control tasks, our approach demonstrates improved worst-case performance on new environments compared to prior methods based on system identification and on robust RL alone.

2022-01-01

ICML (published)

proceedings.mlr.press

The Curious Case of Absolute Position Embeddings

Koustuv Sinha

Amirhossein Kazemnejad

Siva Reddy

Dieuwke Hupkes

Adina Williams

2022-01-01

EMNLP (Findings) (published)

Automated Data-Driven Generation of Personalized Pedagogical Interventions in Intelligent Tutoring Systems

Ekaterina Kochmar

Dung D. Vu

Robert Belfer

Varun Gupta

Iulian V. Serban

2021-07-27

International Journal of Artificial Intelligence in Education (published)

Automated Data-Driven Generation of Personalized Pedagogical Interventions in Intelligent Tutoring Systems

Ekaterina Kochmar

Dung D. Vu

Robert Belfer

Varun Gupta

Iulian V. Serban

2021-07-27

International Journal of Artificial Intelligence in Education (published)

SPeCiaL: Self-Supervised Pretraining for Continual Learning

Lucas Caccia

2021-06-16

ArXiv (preprint)

Correcting Momentum in Temporal Difference Learning

Emmanuel Bengio

Doina Precup

A common optimization tool used in deep reinforcement learning is momentum, which consists in accumulating and discounting past gradients, r… (see more)eapplying them at each iteration. We argue that, unlike in supervised learning, momentum in Temporal Difference (TD) learning accumulates gradients that become doubly stale: not only does the gradient of the loss change due to parameter updates, the loss itself changes due to bootstrapping. We first show that this phenomenon exists, and then propose a first-order correction term to momentum. We show that this correction term improves sample efficiency in policy evaluation by correcting target value drift. An important insight of this work is that deep RL methods are not always best served by directly importing techniques from the supervised setting.

2021-06-07

ArXiv (preprint)

openreview.net

Model-Invariant State Abstractions for Model-Based Reinforcement Learning

Manan Tomar

Amy Zhang

Roberto Calandra

Matthew E. Taylor

Accuracy and generalization of dynamics models is key to the success of model-based reinforcement learning (MBRL). As the complexity of task… (see more)s increases, so does the sample inefficiency of learning accurate dynamics models. However, many complex tasks also exhibit sparsity in the dynamics, i.e., actions have only a local effect on the system dynamics. In this paper, we exploit this property with a causal invariance perspective in the single-task setting, introducing a new type of state abstraction called \textit{model-invariance}. Unlike previous forms of state abstractions, a model-invariance state abstraction leverages causal sparsity over state variables. This allows for compositional generalization to unseen states, something that non-factored forms of state abstractions cannot do. We prove that an optimal policy can be learned over this model-invariance state abstraction and show improved generalization in a simple toy domain. Next, we propose a practical method to approximately learn a model-invariant representation for complex domains and validate our approach by showing improved modelling performance over standard maximum likelihood approaches on challenging tasks, such as the MuJoCo-based Humanoid. Finally, within the MBRL setting we show strong performance gains with respect to sample efficiency across a host of other continuous control tasks.

2021-02-19

ArXiv (preprint)

openreview.net

Improving Reproducibility in Machine Learning Research (A Report from the NeurIPS 2019 Reproducibility Program)

Philippe Vincent‐lamarre

Koustuv Sinha

Vincent Larivière

Alina Beygelzimer

Florence D'alche-buc

E. Fox

Hugo Larochelle