Portrait of Audrey Durand

Audrey Durand

Associate Academic Member
Canada CIFAR AI Chair
Assistant Professor, Université Laval, Department of Computer Science and Software Engineering
Research Topics
Online Learning
Reinforcement Learning
AI for Science

Biography

Audrey Durand is an assistant professor in the Department of Computer Science and Software Engineering, as well as in the Department of Electrical and Computer Engineering, at Université Laval. She specializes in algorithms that learn through interaction with their environment, namely reinforcement learning, and is particularly interested in applying these approaches to the health domain.

Current Students

Research Master's - Université Laval
Research Master's - Université Laval
Research Master's - UdeM
PhD - Université Laval
Research Master's - Université Laval
PhD - Université Laval
PhD - Université Laval
PhD - Université Laval
Postdoctorate - Université Laval

Publications

Robust Fine-Tuning from Non-Robust Pretrained Models: Mitigating Suboptimal Transfer With Epsilon-Scheduling
Fine-tuning pretrained models is a standard and effective workflow in modern machine learning. However, robust fine-tuning (RFT), which aims to simultaneously achieve adaptation to a downstream task and robustness to adversarial examples, remains challenging. Despite the abundance of non-robust pretrained models in open-source repositories, their potential for RFT is less understood. We address this knowledge gap by systematically examining RFT from such non-robust models. Our experiments reveal that fine-tuning non-robust models with a robust objective, even under small perturbations, can lead to poor performance, a phenomenon that we dub _suboptimal transfer_. In challenging scenarios (e.g., difficult tasks, high perturbation), the resulting performance can be so low that it may be considered a transfer failure. We find that fine-tuning using a robust objective impedes task adaptation at the beginning of training and eventually prevents optimal transfer. However, we propose a novel heuristic, _Epsilon-Scheduling_, a schedule over the perturbation strength used during training that promotes optimal transfer. Additionally, we introduce _expected robustness_, a metric that captures performance across a range of perturbations, providing a more comprehensive evaluation of the accuracy-robustness trade-off of diverse models at test time. Extensive experiments on a wide range of configurations (six pretrained models and five datasets) show that _Epsilon-Scheduling_ successfully prevents _suboptimal transfer_ and consistently improves expected robustness.
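The two ingredients named above lend themselves to a compact illustration. Below is a minimal, hypothetical sketch of a perturbation-strength schedule and an expected-robustness average; the linear ramp shape, the `warmup_frac` parameter, and the uniform weighting over perturbation budgets are assumptions made for illustration, not details taken from the paper.

```python
# Hypothetical sketch of the two ideas named in the abstract; the exact
# schedule shape and perturbation grid are assumptions, not from the paper.
import numpy as np


def epsilon_schedule(step: int, total_steps: int, eps_max: float,
                     warmup_frac: float = 0.5) -> float:
    """Perturbation budget at a given training step: linear ramp from 0
    to eps_max over the first `warmup_frac` of training, then constant."""
    warmup_steps = max(1, int(warmup_frac * total_steps))
    return eps_max * min(1.0, step / warmup_steps)


def expected_robustness(accuracy_at_eps: dict) -> float:
    """Uniform average of accuracy over a grid of test-time budgets."""
    return float(np.mean(list(accuracy_at_eps.values())))


if __name__ == "__main__":
    total = 1000
    for step in (0, 250, 500, 750, 999):
        print(step, round(epsilon_schedule(step, total, eps_max=8 / 255), 5))
    # Accuracy measured at a few test-time budgets (made-up numbers).
    print(expected_robustness({0.0: 0.92, 2 / 255: 0.81, 8 / 255: 0.55}))
```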
Use of an Integrated Knowledge Translation Approach to Develop an Electronic Patient-Reported Outcome System for Cancer Rehabilitation: Tutorial
Christian Lopez
Sarah E Neil-Sztramko
Kristin L Campbell
David M Langelier
Tran Truong
Yuliya Gavrylyuk
Pia Nyakairu
Laura Parente
Jackie L Bender
Gillian Strudwick
Jonathan Greenland
Tony Reiman
Jennifer M Jones
Electronic prospective surveillance models (ePSMs) have the potential to improve the management of cancer-related impairments by systematically screening patients using electronic patient-reported outcomes during and after treatment, and linking them to tailored self-management resources and rehabilitation programs. However, their successful implementation into routine care requires careful consideration of patient and provider needs and must align with clinical workflows, which may vary across settings and require adaptation to the local context. The aim of this paper is to describe the development of REACH, a web-based ePSM designed to remotely screen for physical cancer–related impairments and direct patients to rehabilitation resources based on need. The development of REACH followed an integrated knowledge translation (iKT) approach, engaging key knowledge users including patients, clinicians, administrators, and information technology specialists. The development process involved collaboration across 5 working groups. The system content and logic group selected the impairments to be screened, measures used, frequency of screening, and resources recommended based on results of a survey with oncology providers and researchers, patient feedback, a literature review, and an environmental scan. The machine learning group explored predictive modeling approaches to optimize the assessment frequency using retrospective patient data. The implementation group identified features from existing systems that could be built to promote assessment completion and integration into clinical workflows through a scoping review, interviews with clinic staff, and focus groups with patients. The design group conducted co-design workshops and usability testing with patients to iteratively refine the interface and develop a prototype. Finally, the software development group converted the prototype to a web-based application and conducted privacy and security assessments and quality assurance. The integration of key knowledge users through an iKT approach played a critical role in determining the design and functionality of REACH. REACH allows patients to remotely complete assessments tailored to their cancer type and treatment status on any electronic device. The system generates automated advice based on the assessment responses, including links to educational resources for self-management, suggestions for community programs to register for, and recommendations to contact their oncology team for further assessment and possible referral to rehabilitation services. These recommended resources are stored in the patient’s personalized library, organized by type and severity of cancer-related impairments reported, and are updated following each new electronic patient-reported outcomes assessment completed. Additional key system features include a patient-driven and structured process for managing high impairment scores, usability enhancements to improve navigation, and safeguards to ensure data security. The development of REACH demonstrates how an iKT approach can be used to design an ePSM that is user-friendly, clinically relevant, and aligned with implementation considerations. The system has been implemented at 4 Canadian cancer centers, and its implementation is being evaluated to inform future refinements.
LLM-as-a-Judge: Toward World Models for Slate Recommendation Systems
Modeling user preferences across domains remains a key challenge in slate recommendation (i.e., recommending an ordered sequence of items) research. We investigate how Large Language Models (LLMs) can effectively act as world models of user preferences through pairwise reasoning over slates. We conduct an empirical study involving several LLMs on three tasks spanning different datasets. Our results reveal relationships between task performance and properties of the preference function captured by LLMs, hinting at areas for improvement and highlighting the potential of LLMs as world models in recommender systems.
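As a rough illustration of pairwise reasoning over slates, the sketch below formats two ordered item lists into a single comparison prompt and parses a binary preference. The prompt wording and the `judge` callable are placeholders standing in for whatever LLM interface is used; none of this is taken from the paper.

```python
# Hypothetical illustration of an LLM judging pairwise slate preference.
from typing import Callable, Sequence


def pairwise_slate_prompt(user_profile: str,
                          slate_a: Sequence[str],
                          slate_b: Sequence[str]) -> str:
    """Format two ordered item lists for a single pairwise judgment."""
    def fmt(slate: Sequence[str]) -> str:
        return "\n".join(f"{i + 1}. {item}" for i, item in enumerate(slate))
    return (
        f"User profile: {user_profile}\n\n"
        f"Slate A:\n{fmt(slate_a)}\n\nSlate B:\n{fmt(slate_b)}\n\n"
        "Which slate would this user prefer? Answer 'A' or 'B'."
    )


def prefers_a(judge: Callable[[str], str], user_profile: str,
              slate_a: Sequence[str], slate_b: Sequence[str]) -> bool:
    """Query the judge once and parse a binary preference."""
    answer = judge(pairwise_slate_prompt(user_profile, slate_a, slate_b))
    return answer.strip().upper().startswith("A")


if __name__ == "__main__":
    stub = lambda prompt: "A"  # stand-in for a real LLM call
    print(prefers_a(stub, "enjoys sci-fi", ["Dune", "Arrival"], ["Notting Hill"]))
```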
Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts
GWSkyNet-Multi. II. An Updated Machine Learning Model for Rapid Classification of Gravitational-wave Events
Nayyer Raza
Man Leong Chan
Daryl Haggard
Ashish Mahabal
Jess McIver
Multimessenger observations of gravitational waves and electromagnetic emission from compact object mergers offer unique insights into the structure of neutron stars, the formation of heavy elements, and the expansion rate of the Universe. With the LIGO–Virgo–KAGRA (LVK) gravitational-wave detectors currently in their fourth observing run (O4), it is an exciting time for detecting these mergers. However, assessing whether to follow up a candidate gravitational-wave event given limited telescope time and resources is challenging; the candidate can be a false alert due to detector glitches, or may not have any detectable electromagnetic counterpart even if it is real. GWSkyNet-Multi is a machine learning model developed to facilitate follow-up decisions by providing real-time classification of candidate events, using localization information released in LVK rapid public alerts. Here we introduce GWSkyNet-Multi II, an updated model targeted toward providing more robust and informative predictions during O4 and beyond. Specifically, the model now provides normalized probability scores and associated uncertainties for each of the four corresponding source categories released by the LVK: glitch, binary black hole, neutron star–black hole, and binary neutron star. Informed by explainability studies of the original model, the updated model architecture is also significantly simplified, including replacing input images with intuitive summary values that are more interpretable. For significant event alerts issued during O4a and O4b, GWSkyNet-Multi II produces a prediction that is consistent with the updated LVK classification for 93% of events. The updated model can be used by the community to help make time-critical follow-up decisions.
Predicting space use patterns of a territorial top predator: from individual movement decisions to Arctic fox space use
Frédéric Dulude-de Broin
Dominique Berteaux
Joël Bêty
Alexis Grenier-Potvin
Andréanne Beardsell
Jeanne Clermont
Pierre Legagneux
A Guide to Robust Generalization: The Impact of Architecture, Pre-training, and Optimization Strategy
Deep learning models operating in the image domain are vulnerable to small input perturbations. For years, robustness to such perturbations was pursued by training models from scratch (i.e., with random initializations) using specialized loss objectives. Recently, robust fine-tuning has emerged as a more efficient alternative: instead of training from scratch, pretrained models are adapted to maximize predictive performance and robustness. To conduct robust fine-tuning, practitioners design an optimization strategy that includes the model update protocol (e.g., full or partial) and the specialized loss objective. Additional design choices include the architecture type and size, and the pretrained representation. These design choices affect robust generalization, which is the model's ability to maintain performance when exposed to new and unseen perturbations at test time. Understanding how these design choices influence generalization remains an open question with significant practical implications. In response, we present an empirical study spanning 6 datasets, 40 pretrained architectures, 2 specialized losses, and 3 adaptation protocols, yielding 1,440 training configurations and 7,200 robustness measurements across five perturbation types. To our knowledge, this is the most diverse and comprehensive benchmark of robust fine-tuning to date. While attention-based architectures and robust pretrained representations are increasingly popular, we find that convolutional neural networks pretrained in a supervised manner on large datasets often perform best. Our analysis both confirms and challenges prior design assumptions, highlighting promising research directions and offering practical guidance.
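The size of the study follows directly from the design grid described in the abstract; the quick check below confirms the quoted counts (all configuration names here are placeholders, not the paper's actual choices):

```python
# Sanity check of the configuration count: 6 x 40 x 2 x 3 = 1,440.
from itertools import product

datasets = [f"dataset_{i}" for i in range(6)]       # placeholder names
architectures = [f"arch_{i}" for i in range(40)]    # placeholder names
losses = ["loss_a", "loss_b"]                       # placeholder names
protocols = ["full", "partial", "peft"]             # placeholder names

configs = list(product(datasets, architectures, losses, protocols))
print(len(configs))      # 1440 training configurations
print(len(configs) * 5)  # 7200 measurements across 5 perturbation types
```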
Robust Fine-Tuning from Non-Robust Pretrained Models: Mitigating Suboptimal Transfer With Adversarial Scheduling
Yann Batiste Pequignot
Ola Ahmad
Frédéric Precioso
Fine-tuning pretrained models is a standard and effective workflow in modern machine learning. However, robust fine-tuning (RFT), which aims to simultaneously achieve adaptation to a downstream task and robustness to adversarial examples, remains challenging. Despite the abundance of non-robust pretrained models in open-source repositories, their potential for RFT is less understood. We address this knowledge gap by systematically examining RFT from such non-robust models. Our experiments reveal that fine-tuning non-robust models with a robust objective, even under small perturbations, can lead to poor performance, a phenomenon that we dub _suboptimal transfer_. In challenging scenarios (e.g., difficult tasks, high perturbation), the resulting performance can be so low that it may be considered a transfer failure. We find that fine-tuning using a robust objective impedes task adaptation at the beginning of training and eventually prevents optimal transfer. However, we propose a novel heuristic, _Epsilon-Scheduling_, a schedule over the perturbation strength used during training that promotes optimal transfer. Additionally, we introduce _expected robustness_, a metric that captures performance across a range of perturbations, providing a more comprehensive evaluation of the accuracy-robustness trade-off for diverse models at test time. Extensive experiments on a wide range of configurations (six pretrained models and five datasets) show that _Epsilon-Scheduling_ successfully prevents _suboptimal transfer_ and consistently improves expected robustness.
On the Fundamental Limitations of Dual Static CVaR Decompositions in Markov Decision Processes
Multi-Agent Matrix Games with Individual learners: How Exploration-Exploitation Strategies Impact the Emergence of Coordination
Coordination between independent learning agents in a multi-agent environment is an important problem where AI systems may impact each other's learning process. In this paper, we study how individual agents converge to the optimal equilibrium in multi-agent settings where coordination is necessary to achieve optimality. Specifically, we cover both coordination that maximizes each individual payoff and coordination that maximizes the collective payoff (cooperation). We study the emergence of such coordination behaviours in two-player matrix games with unknown payoff matrices and noisy bandit feedback. We consider five different environments along with widely used deterministic and stochastic bandit strategies. We study how different learning strategies and observation noise influence convergence to the optimal equilibrium. Our results indicate that coordination often emerges more easily from interactions between deterministic agents, especially when they follow the same learning behaviour. However, stochastic learning strategies appear to be more robust in the presence of many optimal joint actions. Overall, noisy observations often help stabilize learning behaviours.
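For intuition, here is a minimal sketch of the setting: two independent ε-greedy learners receiving a noisy common payoff in a 2x2 coordination game. The payoff matrix, noise level, and hyperparameters are illustrative assumptions, not values from the paper.

```python
# Two independent epsilon-greedy learners in a 2x2 coordination game
# with noisy bandit feedback. All numbers are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Both players receive the same payoff; (1, 1) is the optimal joint action.
payoff = np.array([[1.0, 0.0],
                   [0.0, 1.5]])

n_actions, eps, noise, steps = 2, 0.1, 0.1, 5000
q = np.zeros((2, n_actions))       # one value estimate table per player
counts = np.zeros((2, n_actions))  # action counts for incremental means

for _ in range(steps):
    acts = []
    for p in range(2):
        if rng.random() < eps:
            acts.append(int(rng.integers(n_actions)))  # explore
        else:
            acts.append(int(np.argmax(q[p])))           # exploit
    r = payoff[acts[0], acts[1]] + rng.normal(0.0, noise)
    for p in range(2):
        counts[p, acts[p]] += 1
        q[p, acts[p]] += (r - q[p, acts[p]]) / counts[p, acts[p]]

print("greedy joint action:", int(np.argmax(q[0])), int(np.argmax(q[1])))
```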
Human-AI Alignment of Learning Trajectories in Video Games: a continual RL benchmark proposal
Yann Harel
Lune P Bellec
We propose a design for a continual reinforcement learning (CRL) benchmark called GHAIA, centered on human-AI alignment of learning trajectories in structured video game environments. Using _Super Mario Bros._ as a case study, gameplay is decomposed into short, annotated scenes organized into diverse task sequences based on gameplay patterns and difficulty. Evaluation protocols measure both plasticity and stability, with flexible revisit and pacing schedules. A key innovation is the inclusion of high-resolution human gameplay data collected under controlled conditions, enabling direct comparison of human and agent learning. In addition to adapting classical CRL metrics like forgetting and backward transfer, we introduce semantic transfer metrics capturing learning over groups of scenes sharing similar game patterns. We demonstrate the feasibility of our approach on human and agent data, and discuss key aspects of the first release for community input.
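The classical metrics mentioned above have standard formulations; the small sketch below computes backward transfer and forgetting from a stage-by-scene performance matrix. The matrix values are made up for illustration, and the semantic transfer metrics introduced by the paper are not reproduced here.

```python
# Classical continual-learning metrics from a matrix R, where R[i, j]
# is performance on scene j after training on scene i (values made up).
import numpy as np

R = np.array([[0.9, 0.1, 0.0],
              [0.7, 0.8, 0.2],
              [0.6, 0.6, 0.9]])
T = R.shape[0] - 1  # index of the final training stage

# Backward transfer: change on earlier scenes after finishing training.
bwt = np.mean([R[T, j] - R[j, j] for j in range(T)])

# Forgetting: best past performance on a scene minus final performance.
forgetting = np.mean([R[:T, j].max() - R[T, j] for j in range(T)])

print(f"BWT = {bwt:.3f}, forgetting = {forgetting:.3f}")
```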
Optimal discounting for offline input-driven MDP
Offline reinforcement learning has gained a lot of popularity for its potential to solve industry challenges. However, real-world environments are often highly stochastic and partially observable, leading long-term planners to overfit to offline data in model-based settings. Input-driven Markov Decision Processes (IDMDPs) offer a way to work with some of the uncertainty by letting designers separate what the agent has control over (states) from what it cannot (inputs) in the environment. These stochastic external inputs are often difficult to model. Under the assumption that the input model will be imperfect, we investigate the bias-variance tradeoff under shallow planning in IDMDPs. Paving the way to input-driven planning horizons, we also investigate the similarity of optimal planning horizons at different inputs given the structure of the input space.
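The "shallow planning" lever here is the discount factor: discounting with γ effectively truncates planning to a horizon of roughly 1/(1-γ) steps, so lowering γ trades long-horizon bias for robustness to an imperfect input model. A one-liner to see the scale (values are illustrative):

```python
# Effective planning horizon induced by discounting: ~ 1 / (1 - gamma).
for gamma in (0.5, 0.9, 0.99):
    print(f"gamma={gamma}: effective horizon ~ {1 / (1 - gamma):.0f} steps")
```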