Publications

Logarithmic-time Schedules for Scaling Language Models with Momentum
In practice, the hyperparameters …
Mining Generalizable Activation Functions
Alex Vitvitskyi
Michael Boratko
Matej Grcic
Deep Shah
Position: Capability Control Should be a Separate Goal From Alignment
Shoaib Ahmed Siddiqui
Eleni Triantafillou
David Krueger
Adrian Weller
Foundation models are trained on broad data distributions, yielding generalist capabilities that enable many downstream applications but also expand the space of potential misuse and failures. This position paper argues that capability control -- imposing restrictions on permissible model behavior -- should be treated as a distinct goal from alignment. While alignment is often context and preference-driven, capability control aims to impose hard operational limits on permissible behaviors, including under adversarial elicitation. We organize capability control mechanisms across the model lifecycle into three layers: (i) data-based control of the training distribution, (ii) learning-based control via weight- or representation-level interventions, and (iii) system-based control via post-deployment guardrails over inputs, outputs, and actions. Because each layer has characteristic failure modes when used in isolation, we advocate for a defense-in-depth approach that composes complementary controls across the full stack. We further outline key open challenges in achieving such control, including the dual-use nature of knowledge and compositional generalization.
GENERator: A Long-Context Generative Genomic Foundation Model
Q. Li
Wei Wu
Yong Zhang
Rui Chen
Mingyang Li
Kun Fu
Junyan Qi
Yongzhou Bao
Chao Wang
Yiheng Zhu
Zhiyun Zhang
Fuli Feng
Jieping Ye
Liu Yuwen
Hui Xiong
Zheng Wang
Yuanyuan Zhang
Ruipu Chen … (2 more)
Chao Wang
Jian Tang
QMAP: A Benchmark for Standardized Evaluation of Antimicrobial Peptide MIC and Hemolytic Activity Regression
Anthony Lavertu
Pascal Germain
Antimicrobial peptides (AMPs) are promising alternatives to conventional antibiotics, but progress in computational AMP discovery has been difficult to quantify due to inconsistent datasets and evaluation protocols. We introduce QMAP, a domain-specific benchmark for predicting AMP antimicrobial potency (MIC) and hemolytic toxicity (HC50) with homology-aware, predefined test sets. QMAP enforces strict sequence homology constraints between training and test data, ensuring that model performance reflects true generalization rather than overfitting. Applying QMAP, we reassess existing MIC models and establish baselines for MIC and HC50 regression. Results show limited progress over six years, poor performance for high-potency MIC regression, and low predictability for hemolytic activity, emphasizing the need for standardized evaluation and improved modeling approaches for highly potent peptides. We release a Python package facilitating practical adoption, with a Rust-accelerated engine enabling efficient data manipulation, installable with pip install qmap-benchmark.
Reporting and Reviewing LLM-Integrated Systems in HCI: Challenges and Considerations
What should HCI scholars consider when reporting and reviewing papers that involve LLM-integrated systems? We interview 18 authors of LLM-integrated system papers on their authoring and reviewing experiences. We find that norms of trust-building between authors and reviewers appear to be eroded by the uncertainty of LLM behavior and hyperbolic rhetoric surrounding AI. Authors perceive that reviewers apply uniquely skeptical and inconsistent standards towards papers that report LLM-integrated systems, and mitigate mistrust by adding technical evaluations, justifying usage, and de-emphasizing LLM presence. Authors' views challenge blanket directives to report all prompts and use open models, arguing that prompt reporting is context-dependent and justifying proprietary model usage despite ethical concerns. Finally, some tensions in peer review appear to stem from clashes between the norms and values of HCI and ML/NLP communities, particularly around what constitutes a contribution and an appropriate level of technical rigor. Based on our findings and additional feedback from six expert HCI researchers, we present a set of guidelines and considerations for authors, reviewers, and HCI communities around reporting and reviewing papers that involve LLM-integrated systems.
Synthesizable Molecular Generation via Soft-constrained GFlowNets with Rich Chemical Priors
D. Biton
Louis Vaillancourt
Yves V. Brun
Adaptive Batch Sizes Using Non-Euclidean Gradient Noise Scales for Stochastic Sign and Spectral Descent
Shagun Gupta
Youssef Briki
Parameswaran Raman
Hao-Jun Michael Shi
To maximize hardware utilization, modern machine learning systems typically employ large constant or manually tuned batch size schedules, relying on heuristics that are brittle and costly to tune. Existing adaptive strategies based on gradient noise scale (GNS) offer a principled alternative. However, their assumption of SGD's Euclidean geometry creates a fundamental mismatch with popular optimizers based on generalized norms, such as signSGD / Signum (
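The paper extends the gradient noise scale beyond Euclidean geometry; the non-Euclidean variants are its contribution and are not reproduced here. As background, a minimal sketch of the classic Euclidean GNS, B_simple = tr(Σ) / ||g||², estimated from per-example gradients (the per-example gradient matrix here is synthetic stand-in data, not from a real model):

```python
import numpy as np

rng = np.random.default_rng(1)
# synthetic per-example gradients: 256 examples, 10 parameters
G = rng.standard_normal((256, 10)) + 0.5
g = G.mean(axis=0)                         # full-batch gradient estimate
trace_sigma = G.var(axis=0, ddof=1).sum()  # tr(Sigma): total per-example gradient variance
b_simple = trace_sigma / (g @ g)           # Euclidean gradient noise scale
```

Larger b_simple (noisier gradients relative to their magnitude) suggests a larger useful batch size; the adaptive schedules discussed above track this quantity during training.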
AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration
Jianhao Ruan
Zhihao Xu
Yiran Peng
Fashen Ren
Zhaoyang Yu
Xinbing Liang
Jinyu Xiang
Yongru Chen
Chenglin Wu
Yuyu Luo
Jiayi Zhang
Language agents have shown strong promise for task automation. Realizing this promise for increasingly complex, long-horizon tasks has driven the rise of a sub-agent-as-tools paradigm for multi-turn task solving. However, existing designs still lack a dynamic abstraction view of sub-agents, thereby hurting adaptability. We address this challenge with a unified, framework-agnostic agent abstraction that models any agent as a tuple (Instruction, Context, Tools, Model). This tuple acts as a compositional recipe for capabilities, enabling the system to spawn specialized executors for each task on demand. Building on this abstraction, we introduce an agentic system AOrchestra, where the central orchestrator concretizes the tuple at each step: it curates task-relevant context, selects tools and models, and delegates execution via on-the-fly automatic agent creation. This design reduces human engineering effort and remains framework-agnostic, with plug-and-play support for diverse agents as task executors. It also enables a controllable performance-cost trade-off, allowing the system to approach Pareto efficiency. Across three challenging benchmarks (GAIA, SWE-Bench, Terminal-Bench), AOrchestra achieves a 16.28% relative improvement over the strongest baseline when paired with Gemini-3-Flash. The code is available at: https://github.com/FoundationAgents/AOrchestra
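The four-field agent abstraction lends itself to a simple record type. A minimal sketch of what such a tuple could look like as a dataclass; the field names, defaults, and tool strings are illustrative assumptions, not AOrchestra's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class AgentSpec:
    """Hypothetical (Instruction, Context, Tools, Model) tuple for a sub-agent."""
    instruction: str                                  # task directive for the executor
    context: list = field(default_factory=list)       # curated task-relevant snippets
    tools: list = field(default_factory=list)         # tool names granted to this agent
    model: str = "gemini-3-flash"                     # placeholder backing-model id

# an orchestrator would concretize one of these per step, then spawn an executor
spec = AgentSpec(instruction="Diagnose the failing test",
                 tools=["read_file", "run_tests"])
```

Because the recipe is plain data, the orchestrator can swap the model or tool set per step to trade performance against cost, which is the trade-off the abstract describes.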
Happiness as a Measure of Fairness
Georg Pichler
Marco Romanelli
In this paper, we propose a novel fairness framework grounded in the concept of _happiness_, a measure of the utility each group gains from decision outcomes. By capturing fairness through this intuitive lens, we not only offer a more human-centered approach, but also one that is mathematically rigorous: In order to compute the optimal, fair post-processing strategy, only a linear program needs to be solved. This makes our method both efficient and scalable with existing optimization tools. Furthermore, it unifies and extends several well-known fairness definitions, and our empirical results highlight its practical strengths across diverse scenarios.
An Indicator of Membership Inference Security in Post-Training Quantized Models
Quantizing machine learning models has demonstrated its effectiveness in lowering memory and inference costs while maintaining performance l… (voir plus)evels comparable to those of the original models. In this work, we investigate the impact of quantization procedures on privacy in data-driven models, focusing on their vulnerability to membership inference attacks. Membership Inference Security (MIS) has recently been proposed to characterize the privacy of machine learning models against the most powerful (and possibly unknown) attacks. However, quantifying MIS appears to be computationally very difficult. In this paper, we propose a new MIS indicator for post-training quantization procedures of machine learning models that minimize an empirical loss. This new indicator is a byproduct of a theoretical asymptotic analysis of the MIS in this context. We also present a methodology for empirically estimating our MIS indicator. Using synthetic datasets and real-world data (in the context of drug discovery), we demonstrate the effectiveness of our approach in assessing and ranking the MIS of different quantizers.
KQ-SVD: Compressing the KV Cache with Provable Guarantees on Attention Fidelity
Damien Lesens
Beheshteh T. Rakhshan
The Key–Value (KV) cache is central to the efficiency of transformer-based large language models (LLMs), storing previously computed vectors to accelerate inference. Yet, as sequence length and batch size grow, the cache becomes a major memory bottleneck. Prior compression methods typically apply low-rank decomposition to keys alone or attempt to jointly embed queries and keys, but both approaches neglect that attention fundamentally depends on their inner products. In this work, we prove that such strategies are sub-optimal for approximating the attention matrix. We introduce KQ-SVD, a simple and computationally efficient method that directly performs an optimal low-rank decomposition of the attention matrix via a closed-form solution. By targeting the true source of redundancy, KQ-SVD preserves attention outputs with higher fidelity under compression. Extensive evaluations on LLaMA and Mistral models demonstrate that our approach consistently delivers superior projection quality.
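The intuition behind factoring the product rather than the keys can be shown with standard linear algebra; this is a hedged illustration of the Eckart–Young argument on random matrices, not KQ-SVD's actual algorithm: a truncated SVD of the score matrix QKᵀ is the optimal rank-r approximation, so it can never do worse than compressing K alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 128, 64, 16
Q = rng.standard_normal((n, d))
K = rng.standard_normal((n, d))

S = Q @ K.T  # pre-softmax score matrix whose low-rank structure is targeted

# optimal rank-r approximation of S (Eckart-Young theorem)
U, s, Vt = np.linalg.svd(S, full_matrices=False)
S_r = (U[:, :r] * s[:r]) @ Vt[:r]
err_product = np.linalg.norm(S - S_r)

# baseline: compress the keys alone to rank r, then form the scores
Uk, sk, Vkt = np.linalg.svd(K, full_matrices=False)
K_r = (Uk[:, :r] * sk[:r]) @ Vkt[:r]
err_keys = np.linalg.norm(S - Q @ K_r.T)
```

Since Q @ K_r.T also has rank at most r, err_product <= err_keys is guaranteed, matching the abstract's claim that key-only compression is sub-optimal for the attention scores.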