Publications
CISO: Species Distribution Modeling Conditioned on Incomplete Species Observations
Species distribution models (SDMs) are widely used to predict species' geographic distributions, serving as critical tools for ecological research and conservation planning. Typically, SDMs relate species occurrences to environmental variables representing abiotic factors, such as temperature, precipitation, and soil properties. However, species distributions are also strongly influenced by biotic interactions with other species, which are often overlooked. While some methods partially address this limitation by incorporating biotic interactions, they often assume symmetrical pairwise relationships between species and require consistent co-occurrence data. In practice, species observations are sparse, and the availability of information about the presence or absence of other species varies significantly across locations. To address these challenges, we propose CISO, a deep learning-based method for species distribution modeling Conditioned on Incomplete Species Observations. CISO enables predictions to be conditioned on a flexible number of species observations alongside environmental variables, accommodating the variability and incompleteness of available biotic data. We demonstrate our approach using three datasets representing different species groups: sPlotOpen for plants, SatBird for birds, and a new dataset, SatButterfly, for butterflies. Our results show that including partial biotic information improves predictive performance on spatially separate test sets. When conditioned on a subset of species within the same dataset, CISO outperforms alternative methods in predicting the distribution of the remaining species. Furthermore, we show that combining observations from multiple datasets can improve performance. CISO is a promising ecological tool, capable of incorporating incomplete biotic information and identifying potential interactions between species from disparate taxa.
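The key idea of conditioning on a variable number of observed species can be illustrated with a toy sketch (not the CISO implementation; sizes, the mask encoding, and the single random linear layer standing in for the deep network are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

n_species, n_env = 6, 4
env = rng.normal(size=n_env)                               # environmental covariates at a site
species_obs = rng.integers(0, 2, n_species).astype(float)  # presence/absence records
observed_mask = np.array([1, 1, 0, 1, 0, 0], dtype=float)  # which species were actually surveyed

# Condition only on species that were observed; unobserved entries are
# zeroed and the mask itself is part of the input, so the model can
# distinguish "recorded as absent" from "not surveyed at all".
conditioned = np.concatenate([env, species_obs * observed_mask, observed_mask])

# A single random linear layer + sigmoid stands in for the deep network.
W = rng.normal(size=(n_species, conditioned.size))
pred = 1.0 / (1.0 + np.exp(-(W @ conditioned)))  # predicted presence probabilities
```

Because the mask length is fixed but its contents vary per site, the same network can handle any subset of observed species, which is the flexibility the abstract describes.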
While single-cell technologies provide snapshots of tumor states, building continuous trajectories and uncovering causative gene regulatory networks remains a significant challenge. We present Cflows, an AI framework that combines neural ODE networks with Granger causality to infer continuous cell state transitions and gene regulatory interactions from static scRNA-seq data. In a new 5-time point dataset capturing tumorsphere development over 30 days, Cflows reconstructs two types of trajectories leading to tumorsphere formation or apoptosis. Trajectory-based cell-of-origin analysis delineated a novel cancer stem cell profile characterized by CD44^hi EPCAM^+ CAV1^+, and uncovered a cell cycle–dependent enrichment of tumorsphere-initiating potential in G2/M or S-phase cells. Cflows uncovers ESRRA as a crucial causal driver of the tumor-forming gene regulatory network. Indeed, ESRRA inhibition significantly reduces tumor growth and metastasis in vivo. Cflows offers a powerful framework for uncovering cellular transitions and dynamic regulatory networks from static single-cell data.
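The neural ODE idea behind this kind of trajectory inference can be sketched minimally (this is not Cflows' code; the dimensions, random untrained weights, and forward-Euler solver are illustrative assumptions — in practice the vector field is fitted so trajectories match the observed snapshot distributions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny stand-in for a neural ODE right-hand side f(x): a one-hidden-layer
# network with random (untrained) weights.
W1 = rng.normal(size=(8, 3)) * 0.1
W2 = rng.normal(size=(3, 8)) * 0.1

def f(x):
    return W2 @ np.tanh(W1 @ x)

def integrate(x0, t_steps=100, dt=0.05):
    """Forward-Euler integration of dx/dt = f(x) from a starting cell state."""
    traj = [x0]
    x = x0
    for _ in range(t_steps):
        x = x + dt * f(x)
        traj.append(x)
    return np.stack(traj)

cell_state = rng.normal(size=3)     # e.g. a low-dimensional embedding of one cell
trajectory = integrate(cell_state)  # a continuous path through state space
```

Once such continuous trajectories exist, Granger-style causality can be tested between genes along them, which static snapshots alone do not permit.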
Some of the strongest evidence that human minds should be thought about in terms of symbolic systems has been the way they combine ideas, produce novelty, and learn quickly. We argue that modern neural networks -- and the artificial intelligence systems built upon them -- exhibit similar abilities. This undermines the argument that the cognitive processes and representations used by human minds are symbolic, although the fact that these neural networks are typically trained on data generated by symbolic systems illustrates that such systems play an important role in characterizing the abstract problems that human minds have to solve. This argument leads us to offer a new agenda for research on the symbolic basis of human thought.
In order to better understand manifold neural networks (MNNs), we introduce Manifold Filter-Combine Networks (MFCNs). Our filter-combine framework parallels the popular aggregate-combine paradigm for graph neural networks (GNNs) and naturally suggests many interesting families of MNNs which can be interpreted as manifold analogues of various popular GNNs. We propose a method for implementing MFCNs on high-dimensional point clouds that relies on approximating an underlying manifold by a sparse graph. We then prove that our method is consistent in the sense that it converges to a continuum limit as the number of data points tends to infinity, and we numerically demonstrate its effectiveness on real-world and synthetic data sets.
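The graph-approximation step can be illustrated with a toy sketch (not the paper's implementation; the circle manifold, the epsilon-neighborhood rule, and the particular low-pass filter h(L) = (I + tL)^{-1} are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# Sample points from a circle: a 1-D manifold embedded in R^2.
theta = rng.uniform(0, 2 * np.pi, size=64)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# Approximate the manifold by a sparse epsilon-neighborhood graph.
d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
A = ((d < 0.4) & (d > 0)).astype(float)   # adjacency
L = np.diag(A.sum(1)) - A                 # graph Laplacian

# One filter-combine step on a signal over the points: a spectral
# low-pass filter h(L) = (I + t L)^{-1}, then a pointwise nonlinearity.
signal = rng.normal(size=len(X))
filtered = np.linalg.solve(np.eye(len(X)) + 0.5 * L, signal)
activated = np.maximum(filtered, 0)       # combine step: ReLU
```

The consistency result quoted in the abstract says, roughly, that as the number of sampled points grows, operations like this graph filter converge to their continuum counterparts on the manifold.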
2025-08-04
Sampling Theory, Signal Processing, and Data Analysis (published)
Understanding In-Context Learning of Linear Models in Transformers Through an Adversarial Lens
Usman Anwar
Johannes Von Oswald
Louis Kirsch
David M. Krueger
Spencer Frei
In this work, we make two contributions towards understanding of in-context learning of linear models by transformers. First, we investigate the adversarial robustness of in-context learning in transformers to hijacking attacks — a type of adversarial attack in which the adversary’s goal is to manipulate the prompt to force the transformer to generate a specific output. We show that both linear transformers and transformers with GPT-2 architectures are vulnerable to such hijacking attacks. However, adversarial robustness to such attacks can be significantly improved through adversarial training --- done either at the pretraining or finetuning stage --- and can generalize to stronger attack models. Our second main contribution is a comparative analysis of adversarial vulnerabilities across transformer models and other algorithms for learning linear models. This reveals two novel findings. First, adversarial attacks transfer poorly between larger transformer models trained from different seeds despite achieving similar in-distribution performance. This suggests that transformers of the same architecture trained according to the same recipe may implement different in-context learning algorithms for the same task. Second, we observe that attacks do not transfer well between classical learning algorithms for linear models (single-step gradient descent and ordinary least squares) and transformers. This suggests that there could be qualitative differences between the in-context learning algorithms that transformers implement and these traditional algorithms.
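A hijacking attack is easy to demonstrate against one of the classical baselines the paper compares with — ordinary least squares as an in-context learner. This toy sketch (not the paper's attack, which targets actual transformers; the exact-solve construction below works only because OLS is linear in the context labels) perturbs a single in-context label to force an arbitrary query prediction:

```python
import numpy as np

rng = np.random.default_rng(3)

# "In-context learning" baseline: solve OLS on the prompt examples,
# then predict the query label.
def icl_predict(X_ctx, y_ctx, x_query):
    w, *_ = np.linalg.lstsq(X_ctx, y_ctx, rcond=None)
    return x_query @ w

d, n = 4, 16
w_true = rng.normal(size=d)
X_ctx = rng.normal(size=(n, d))
y_ctx = X_ctx @ w_true
x_q = rng.normal(size=d)

clean_pred = icl_predict(X_ctx, y_ctx, x_q)

# Hijacking: edit one context label to steer the query prediction to an
# arbitrary target. The prediction is linear in y_ctx, with sensitivity
# vector x_q @ pinv(X_ctx), so the needed edit can be solved exactly.
target = 10.0
sens = x_q @ np.linalg.pinv(X_ctx)
i = int(np.argmax(np.abs(sens)))
y_adv = y_ctx.copy()
y_adv[i] += (target - clean_pred) / sens[i]

hijacked_pred = icl_predict(X_ctx, y_adv, x_q)
```

Against transformers no such closed form exists, which is why the paper resorts to gradient-based prompt attacks; but the underlying goal — one small prompt edit, arbitrary output — is the same.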
Virtual cell modeling represents an emerging frontier at the intersection of artificial intelligence and biology, aiming to quantitatively predict quantities such as responses to diverse perturbations. However, autonomously building computational models for virtual cells is challenging due to the complexity of biological systems, the heterogeneity of data modalities, and the need for domain-specific expertise across multiple disciplines. Here, we introduce CellForge, an agentic system that leverages a multi-agent framework to transform presented biological datasets and research objectives directly into optimized computational models for virtual cells. More specifically, given only raw single-cell multi-omics data and task descriptions as input, CellForge outputs both an optimized model architecture and executable code for training virtual cell models and inference. The framework integrates three core modules: Task Analysis for presented dataset characterization and relevant literature retrieval, Method Design, where specialized agents collaboratively develop optimized modeling strategies, and Experiment Execution for automated generation of code. The agents in the Design module are separated into experts with differing perspectives and a central moderator, and have to collaboratively exchange solutions until they achieve a reasonable consensus. We demonstrate CellForge's capabilities in single-cell perturbation prediction, using six diverse datasets that encompass gene knockouts, drug treatments, and cytokine stimulations across multiple modalities. CellForge consistently outperforms task-specific state-of-the-art methods. Overall, CellForge demonstrates how iterative interaction between LLM agents with differing perspectives provides better solutions than directly addressing a modeling challenge. Our code is publicly available at https://github.com/gersteinlab/CellForge.
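The expert/moderator consensus loop in the Method Design module can be caricatured as follows (a schematic only, not CellForge's code: the expert names, the hard-coded initial proposals, and the drift-to-majority update rule are all hypothetical stand-ins for LLM agents):

```python
from collections import Counter

def expert(name, peer_views):
    # Stand-in for an LLM expert agent: with no peer input it states its
    # initial proposal; after seeing peers' views it drifts to the majority.
    if peer_views:
        return Counter(peer_views).most_common(1)[0][0]
    return {"vae_expert": "cVAE", "gnn_expert": "graph-VAE",
            "transformer_expert": "cVAE"}[name]

experts = ["vae_expert", "gnn_expert", "transformer_expert"]
views = {e: expert(e, []) for e in experts}

for _ in range(5):                       # bounded number of exchange rounds
    if len(set(views.values())) == 1:    # moderator: consensus reached
        break
    peer_views = list(views.values())
    views = {e: expert(e, peer_views) for e in experts}

consensus = set(views.values()).pop()
```

The point of the caricature is structural: proposals are exchanged in rounds and a moderator terminates the loop on agreement, rather than a single agent committing to the first design it generates.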
Many real-world applications require recognition models that are robust to different operational conditions and modalities, but at the same time run on small embedded devices with limited hardware. While for normal-size models pre-training is known to be very beneficial for accuracy and robustness, for small models that can be deployed on embedded and edge devices, its effect is not clear. In this work, we investigate the effect of ImageNet pretraining on increasingly small backbone architectures (ultra-small models, with
This special day event on Responsible Artificial Intelligence (AI) brings together researchers, practitioners, and policymakers to explore how data mining and machine learning systems can be designed to align with ethical principles, societal values, and human well-being. As AI technologies increasingly influence decisions in healthcare, finance, governance, and social systems, there is a critical need to develop frameworks that embed fairness, accountability, and privacy directly into the foundations of knowledge discovery. This full-day event will feature a mix of invited talks, interactive debates, expert panels, and peer-reviewed research presentations, all focused on the practical integration of ethical design into data-driven systems. The Responsible AI Day builds on the success of Canada's NSERC CREATE Program on Responsible AI, an interdisciplinary initiative training the next generation of AI researchers across computer science, law, bioethics, public health, and media studies. Topics will span scalable AI governance, privacy-preserving computation, algorithmic bias mitigation, and the socio-legal tensions emerging in generative AI. By positioning responsible AI as a sociotechnical challenge, this special day aligns with KDD's mission of advancing data science that is not only technically robust but also socially conscious.
2025-08-02
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 (published)