Publications

GRACE: A Language Model Framework for Explainable Inverse Reinforcement Learning

Silvia Sapora

R Devon Hjelm

Alexander T Toshev

Omar Attia

Inverse Reinforcement Learning aims to recover reward models from expert demonstrations, but traditional methods yield "black-box" models that are difficult to interpret and debug. In this work, we introduce GRACE (Generating Rewards As CodE), a method for using Large Language Models within an evolutionary search to reverse-engineer an interpretable, code-based reward function directly from expert trajectories. The resulting reward function is executable code that can be inspected and verified. We empirically validate GRACE on the BabyAI and AndroidWorld benchmarks, where it efficiently learns highly accurate rewards, even in complex, multi-task settings. Further, we demonstrate that the resulting reward leads to strong policies, compared to both competitive Imitation Learning and online RL approaches with ground-truth rewards. Finally, we show that GRACE is able to build complex reward APIs in multi-task setups.

2025-10-01

ArXiv (preprint)

doi.org

arxiv.org

Attention-Based Multi-Agent RL for Multi-Machine Tending Using Mobile Robots

Abdalwhab Bakheet Mohamed Abdalwhab

Giovanni Beltrame

Samira Ebrahimi Kahou

David St-Onge

Robotics can help address the growing worker shortage challenge of the manufacturing industry. As such, machine tending is a task collaborative robots can tackle that can also greatly boost productivity. Nevertheless, existing robotics systems deployed in that sector rely on a fixed single-arm setup, whereas mobile robots can provide more flexibility and scalability. We introduce a multi-agent multi-machine-tending learning framework using mobile robots based on multi-agent reinforcement learning (MARL) techniques, with the design of a suitable observation and reward. Moreover, we integrate an attention-based encoding mechanism into the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm to boost its performance for machine-tending scenarios. Our model (AB-MAPPO) outperforms MAPPO in this new challenging scenario in terms of task success, safety, and resource utilization. Furthermore, we provided an extensive ablation study to support our design decisions.

2025-09-30

AI (published)

doi.org

Combining virtual reality and hypnosis to alleviate chronic pain in elderly with hand arthritis: protocol for a randomised phase II clinical trial

Valentyn Fournier

Marie-Fania Simard

Sai Yan Yuen

Joséphine Guiné

Floriane Rousseaux

Julie Lebeau

Karim Jerbi

Philippe Richebé

Mathieu Landry

Pierre Rainville

David Ogez

Chronic pain is a common health condition that significantly impacts the quality of life of those affected, affecting one in five people in Canada. The prevalence of this condition tends to increase with age, making it a major health issue given the ageing population. However, its management remains inadequate and requires significant mobilisation of healthcare professionals as well as the development of multiple therapeutic solutions. Among these, non-pharmacological interventions such as hypnosis and virtual reality have proven effective. Nevertheless, while the existing literature seems promising, it presents methodological limitations. Therefore, this study aims to assess the effectiveness of an intervention combining virtual reality and hypnosis in an ageing population suffering from a widespread chronic pain condition, that is, hand arthritis. This study will be a single-centre randomised clinical trial. Participants will be randomly assigned to one of two conditions: one receiving an intervention combining virtual reality and hypnosis, and the other receiving only virtual reality. The effectiveness of the intervention on current perceived pain before and after the intervention (primary outcome) will be evaluated. Secondary outcomes will include anxiety and depressive symptoms, quality of life, relaxation and fatigue. Exploratory analyses will also be conducted to contribute to the emerging literature by examining physiological variables such as heart rate variability, respiratory rate and electrodermal activity during the intervention, and their relationship with primary and secondary outcomes. The project was approved by the Research Ethical Committee of the Hospital Maisonneuve-Rosemont (Project no 2024-3539). Participants will be asked to provide written consent for their participation. Results from this study will be shared through peer-reviewed publications, as well as oral and poster presentations at scientific events.
The protocol for this study was preregistered on Open Science Framework and raw anonymised data will be available on this platform (https://osf.io/vbh72/?view_only=1d17c5708f894faab6669d85e1fde75d). Trial registration: NCT06833905.

2025-09-30

BMJ Open (published)

doi.org

Current landscape of clinical genetics knowledge and attitudes among Non-Geneticist Physicians - the McGill genetics education survey (McGES).

Sarah Abdullah-Maklan

Yannis Trakadis

2025-09-30

Journal of Community Genetics (published)

doi.org

Equivariant Geometric Scattering Networks via Vector Diffusion Wavelets

David R. Johnson

Rishabh Anand

Smita Krishnaswamy

Michael Perlmutter

2025-09-30

ArXiv (preprint)

doi.org

arxiv.org

Intersecting perspectives: A participatory street review framework for urban inclusivity

Rashid Mushkani

Shin Koseki

2025-09-30

Habitat International (published)

doi.org

Predicting space use patterns of a territorial top predator: from individual movement decisions to Arctic fox space use

Frédéric Dulude-de Broin

Dominique Berteaux

Joël Bêty

Catherine Villeneuve

Alexis Grenier-Potvin

Andréanne Beardsell

Jeanne Clermont

Audrey Durand

Pierre Legagneux

2025-09-30

bioRxiv (preprint)

doi.org

"A Responsible Framework for Applying Artificial Intelligence on Medical Images and Signals at the Point of Care: The PACS-AI Platform [Canadian Journal of Cardiology Volume 40, Issue 10, October 2024, Pages 1828-1840]".

Pascal Thériault-Lauzier

Denis Corbin

Olivier Tastet

Élodie Labrecque Langlais

B. Taji

Guson Kang

A. Chong

Derek So

An Tang

J. W. Gichoya

A. Chandar

Pierre-Luc Deziel

Julie G Hussin

Samuel Kadoury

Robert Avram

2025-09-30

Canadian Journal of Cardiology (published)

doi.org

Similarity-based transfer learning with deep learning networks for accurate CRISPR-Cas9 off-target prediction.

Jérémy Charlier

Zeinab Sherkatghanad

Vladimir Makarenkov

Transfer learning has emerged as a powerful tool for enhancing predictive accuracy in complex tasks, particularly in scenarios where data is limited or imbalanced. This study explores the use of similarity-based pre-evaluation as a methodology to identify optimal source datasets for transfer learning, addressing the dual challenge of efficient source-target dataset pairing and off-target prediction in CRISPR-Cas9, while existing transfer learning applications in the field of gene editing often lack a principled method for source dataset selection. We use cosine, Euclidean, and Manhattan distances to evaluate similarity between the source and target datasets used in our transfer learning experiments. Four deep learning network architectures, i.e. Multilayer Perceptron (MLP), Convolutional Neural Networks (CNNs), Feedforward Neural Networks (FNNs), and Recurrent Neural Networks (RNNs), and two traditional machine learning models, i.e. Logistic Regression (LR) and Random Forest (RF), were tested and compared in our simulations. The results suggest that similarity scores are reliable indicators for pre-selecting source datasets in CRISPR-Cas9 transfer learning experiments, with cosine distance proving to be a more effective dataset comparison metric than either Euclidean or Manhattan distances. An RNN-GRU, a 5-layer FNN, and two MLP variants provided the best overall prediction results in our simulations. By integrating similarity-based source pre-selection with machine learning outcomes, we propose a dual-layered framework that not only streamlines the transfer learning process but also significantly improves off-target prediction accuracy. The code and data used in this study are freely available at: https://github.com/dagrate/transferlearning_offtargets.

2025-09-30

PLoS Computational Biology (published)

doi.org

The Three Regimes of Offline-to-Online Reinforcement Learning

Offline-to-online reinforcement learning (RL) has emerged as a practical paradigm that leverages offline datasets for pretraining and online interactions for fine-tuning. However, its empirical behavior is highly inconsistent: design choices of online fine-tuning that work well in one setting can fail completely in another. We propose a stability-plasticity principle that can explain this inconsistency: we should preserve the knowledge of pretrained policy or offline dataset during online fine-tuning, whichever is better, while maintaining sufficient plasticity. This perspective identifies three regimes of online fine-tuning, each requiring distinct stability properties. We validate this framework through a large-scale empirical study, finding that the results strongly align with its predictions in 45 of 63 cases. This work provides a principled framework for guiding design choices in offline-to-online RL based on the relative performance of the offline dataset and the pretrained policy.

2025-09-30

arXiv (preprint)

doi.org

openreview.net

They Hear Me Rolling: Design and Characterization of a Distributed, Rolling Acoustic-Tactile Sensor

Wilfred Mason

David Brenken

Olivier St-Martin Cormier

Audrey Sedal

Tactile sensor design has been widely explored at the centimeter-scale; fewer explorations exist in larger scale systems with varied geometries. We present a meter-scale tactile sensor for wheeled robotic platforms based on a flexible acoustic waveguide. This sensor architecture performs contact sensing over the surface of a rotating wheel with a single transducer that is separated from the sensing surface. The design and characterization of the sensor are presented, along with a demonstration of a state-estimation framework using tactile sensor feedback to measure surface features.

2025-09-30

IEEE Sensors Letters (published)

doi.org

VDW-GNNs: Vector diffusion wavelets for geometric graph neural networks

David R. Johnson

Alexander Sietsema

Rishabh Anand

Deanna Needell

Smita Krishnaswamy

Michael Perlmutter

We introduce vector diffusion wavelets (VDWs), a novel family of wavelets inspired by the vector diffusion maps algorithm that was introduced to analyze data lying in the tangent bundle of a Riemannian manifold. We show that these wavelets may be effectively incorporated into a family of geometric graph neural networks, which we refer to as VDW-GNNs. We demonstrate that such networks are effective on synthetic point cloud data, as well as on real-world data derived from wind-field measurements and neural activity data. Theoretically, we prove that these new wavelets have desirable frame theoretic properties, similar to traditional diffusion wavelets. Additionally, we prove that these wavelets have desirable symmetries with respect to rotations and translations.

2025-09-30

ArXiv (preprint)

arxiv.org