
Audrey Durand

Associate Academic Member
Canada CIFAR AI Chair
Assistant Professor, Université Laval, Department of Computer Science and Software Engineering
Research Topics
Online Learning
Reinforcement Learning

Biography

Audrey Durand is an assistant professor in the Department of Computer Science and Software Engineering and in the Department of Electrical and Computer Engineering at Université Laval. She specializes in algorithms that learn through interaction with their environment, namely reinforcement learning, and is particularly interested in applying these approaches to health care.

Current Students

Research Master's - Université Laval
Research Intern - Université Laval
Research Master's - UdeM
Principal supervisor:
PhD - Université Laval
PhD - McGill
Co-supervisor:
Research Master's - Université Laval
PhD - Université Laval
Research Master's - Université Laval
Research Master's - Université Laval
PhD - Université Laval

Publications

Multi-Agent Matrix Games with Individual Learners: How Exploration-Exploitation Strategies Impact the Emergence of Coordination
Julien Armand
Tommy Chien-Hsuan Lin
Maxime Heuillet
Coordination between independent learning agents in a multi-agent environment is an important problem where AI systems may impact each other's learning process. In this paper, we study how individual agents converge to an optimal equilibrium in multi-agent settings where coordination is necessary to achieve optimality. Specifically, we cover both coordination to maximize each agent's individual payoff and coordination to maximize the collective payoff (cooperation). We study the emergence of such coordination behaviours in two-player matrix games with unknown payoff matrices and noisy bandit feedback. We consider five different environments along with widely used deterministic and stochastic bandit strategies, and study how different learning strategies and observation noise influence convergence to the optimal equilibrium. Our results indicate that coordination often emerges more easily from interactions between deterministic agents, especially when they follow the same learning behaviour. However, stochastic learning strategies appear to be more robust in the presence of many optimal joint actions. Overall, noisy observations often help stabilize learning behaviours.
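The setting can be sketched with two independent learners on a 2x2 coordination game with noisy bandit feedback. This is only an illustration of the problem class, not the paper's actual protocol: the payoff matrix, the shared-reward (cooperation) variant, and the choice of an epsilon-greedy strategy are all my own assumptions.

```python
import random

# Illustrative 2x2 coordination game: both agents receive the listed payoff
# for the joint action (i, j); the matching pairs (0,0) and (1,1) are the
# optimal equilibria. (Values are made up for this sketch.)
PAYOFF = [[1.0, 0.0],
          [0.0, 1.0]]
NOISE = 0.1  # std. dev. of Gaussian observation noise on the payoff

class EpsGreedyAgent:
    """Independent learner: keeps an empirical mean per own action only,
    never observing the other agent's choice."""
    def __init__(self, n_actions=2, eps=0.1):
        self.eps = eps
        self.counts = [0] * n_actions
        self.means = [0.0] * n_actions

    def act(self):
        if random.random() < self.eps:
            return random.randrange(len(self.means))
        return max(range(len(self.means)), key=lambda a: self.means[a])

    def update(self, action, reward):
        self.counts[action] += 1
        self.means[action] += (reward - self.means[action]) / self.counts[action]

random.seed(0)
a1, a2 = EpsGreedyAgent(), EpsGreedyAgent()
joint_counts = {}
for t in range(5000):
    i, j = a1.act(), a2.act()
    r = PAYOFF[i][j] + random.gauss(0, NOISE)  # shared, noisy payoff
    a1.update(i, r)
    a2.update(j, r)
    joint_counts[(i, j)] = joint_counts.get((i, j), 0) + 1

# With a shared payoff, independent learners typically lock onto one of
# the matching pairs, i.e. coordination emerges without communication.
best = max(joint_counts, key=joint_counts.get)
print("most frequent joint action:", best)
```

Swapping the epsilon-greedy rule for a deterministic (e.g. UCB-style) or stochastic (e.g. Thompson-style) strategy is what lets one compare how exploration schemes affect which equilibrium, if any, the pair settles on.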
Optimal discounting for offline input-driven MDP
Randy Lefebvre
Offline reinforcement learning has gained a lot of popularity for its potential to solve industry challenges. However, real-world environments are often highly stochastic and partially observable, leading long-term planners to overfit to offline data in model-based settings. Input-driven Markov Decision Processes (IDMDPs) offer a way to work with some of the uncertainty by letting designers separate what the agent has control over (states) from what it cannot control (inputs) in the environment. These stochastic external inputs are often difficult to model. Under the assumption that the input model will be imperfect, we investigate the bias-variance tradeoff under shallow planning in IDMDPs. Paving the way to input-driven planning horizons, we also investigate the similarity of optimal planning horizons at different inputs, given the structure of the input space.
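The link between the discount factor and planning depth can be illustrated on a toy chain MDP (my own construction, not one from the paper): a small discount acts as a shallow planning horizon and favours immediate reward, while a large discount reaches for a delayed payoff.

```python
import numpy as np

# Toy 3-state chain (illustrative). Action 0 stays put for a small
# immediate reward; action 1 walks right toward a larger reward that
# only pays off at the end of the chain.
n_states = 3
R = np.array([[0.3, 0.0],   # R[state, action] = expected reward
              [0.3, 0.0],
              [0.3, 1.0]])

def step(s, a):
    """Deterministic transition: stay (a=0) or move right (a=1)."""
    return s if a == 0 else min(s + 1, n_states - 1)

def greedy_policy(gamma, iters=200):
    """Value iteration; a smaller gamma means a shallower effective horizon."""
    V = np.zeros(n_states)
    for _ in range(iters):
        V = np.array([max(R[s, a] + gamma * V[step(s, a)] for a in (0, 1))
                      for s in range(n_states)])
    return [max((0, 1), key=lambda a: R[s, a] + gamma * V[step(s, a)])
            for s in range(n_states)]

print(greedy_policy(gamma=0.2))   # → [0, 0, 1]: myopic, takes the 0.3 now
print(greedy_policy(gamma=0.95))  # → [1, 1, 1]: walks toward the delayed 1.0
```

When the transition model is itself estimated from imperfect offline data (the IDMDP input model), shortening the horizon this way trades planning bias for reduced sensitivity to model error, which is the bias-variance tradeoff the abstract refers to.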
Platform-based Adaptive Experimental Research in Education: Lessons Learned from The Digital Learning Challenge
Ilya Musabirov
Mohi Reza
Haochen Song
Steven Moore
Pan Chen
Harsh Kumar
Tong Li
John Stamper
Norman Bier
Anna Rafferty
Thomas Price
Nina Deliu
Michael Liut
Joseph Jay Williams
We report on our experience with a real-world, multi-experimental evaluation of an adaptive experimentation platform within the XPRIZE Digital Learning Challenge framework. We showcase how EASI (Experiment as a Service) cross-platform software supports quick integration and deployment of adaptive experiments, as well as five systematic replications within a 30-day timeframe. We outline the key scenarios for the applicability of platform-supported experiments and reflect on lessons learned from this two-year project that can help researchers and practitioners integrate adaptive experiments into real-world courses.
Adaptive Experiments Under High-Dimensional and Data Sparse Settings: Applications for Educational Platforms
Haochen Song
Ilya Musabirov
Ananya Bhattacharjee
Meredith Franklin
Anna Rafferty
Joseph Jay Williams
In online educational platforms, adaptive experiment designs play a critical role in personalizing learning pathways, instructional sequencing, and content recommendations. Traditional adaptive policies, such as Thompson Sampling, struggle with scalability in high-dimensional and sparse settings, such as when there is a large number of treatments (arms) but limited resources (funding and time) and a constrained classroom sample size. Furthermore, under-exploration in large-scale educational interventions can lead to suboptimal learning recommendations. To address these challenges, we build upon the concept of lenient regret, which tolerates limited suboptimal selections to enhance exploratory learning, and propose a framework for determining the feasible number of treatments given a sample size. We illustrate these ideas with a case study in online educational learnersourcing, where adaptive algorithms dynamically allocate peer-crafted interventions to other students during active recall exercises. Our proposed Weighted Allocation Probability Adjusted Thompson Sampling (WAPTS) algorithm enhances the efficiency of treatment allocation by adjusting sampling weights to balance exploration and exploitation in data-sparse environments. We present comparative evaluations of WAPTS across various sample sizes (N=50, 300, 1000) and treatment conditions, demonstrating its ability to mitigate under-exploration while optimizing learning outcomes.
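For context, the Beta-Bernoulli Thompson Sampling baseline that WAPTS builds on can be written in a few lines. This sketch is the generic algorithm only, not WAPTS itself (the weight-adjustment step is omitted), and the arm means and sample sizes are made up for illustration.

```python
import random

def thompson_sampling(true_means, horizon, seed=0):
    """Standard Beta-Bernoulli Thompson Sampling: sample a mean estimate
    per arm from its Beta posterior, then pull the arm whose sample is
    largest. Returns the pull count per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1] * k  # Beta(1, 1) uniform priors
    beta = [1] * k
    pulls = [0] * k
    for _ in range(horizon):
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        arm = max(range(k), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward        # posterior update on success
        beta[arm] += 1 - reward     # posterior update on failure
        pulls[arm] += 1
    return pulls

# Ten arms (nine at 0.5, one at 0.7). With a small budget relative to the
# number of arms, pulls are spread thin: the under-exploration regime the
# abstract describes. A larger budget lets allocation concentrate.
print(thompson_sampling([0.5] * 9 + [0.7], horizon=50))
print(thompson_sampling([0.5] * 9 + [0.7], horizon=1000))
```

Running the small-budget case shows why many-armed, data-sparse deployments need modifications such as reweighted sampling probabilities: with only a handful of pulls per arm, the posteriors barely separate.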
Development of AI-assisted microscopy frameworks through realistic simulation with pySTED
Anthony Bilodeau
Albert Michaud-Gagnon
Julia Chabbert
Benoit Turcotte
Jörn Heine
Flavie Lavoie-Cardinal
The integration of artificial intelligence (AI) into microscopy systems significantly enhances performance, optimizing both the image acquisition and analysis phases. Development of AI-assisted super-resolution microscopy is often limited by access to large biological datasets, as well as by the difficulties of benchmarking and comparing approaches on heterogeneous samples. We demonstrate the benefits of a realistic STED simulation platform, pySTED, for the development and deployment of AI strategies for super-resolution microscopy. The simulation environment provided by pySTED allows the augmentation of data for the training of deep neural networks, the development of online optimization strategies, and the training of reinforcement learning models that can be deployed successfully on a real microscope.
Randomized Confidence Bounds for Stochastic Partial Monitoring
Maxime Heuillet
Ola Ahmad