Publications

Learning from Pairwise Preferences in Long-Term Decision Problems

Jonathan Colaço Carr

Prakash Panangaden

Doina Precup

Benjamin Van Roy

Agents that can beat or tie any other under a model of pairwise preference have strong guarantees for both user satisfaction and overall social welfare. However, searching for these agents in long-term decision problems is not computationally tractable with current approaches, which require the size of an agent's policy to increase with the problem length. We introduce the Markov decision contest, a model of learning from general preferences in long-term (infinite-horizon) decision problems. Within this model, we prove that agents only need a stationary Markov policy in order to be optimal (that is, to beat or tie any agent with a history-dependent policy); that the problem of finding an optimal policy is in P; and that a simple iterative algorithm (which we call Hedged Policy Iteration) converges to an optimal policy at a sublinear rate. In a suite of high-dimensional experiments, we demonstrate that Hedged Policy Iteration scales well to function approximation. Lastly, we present a close approximation of Hedged Policy Iteration, called HPI-Clip, which matches the performance of Proximal Policy Optimization on reward-based tasks while outperforming it on tasks with non-transitive preferences. These results show that learning from pairwise preferences in long-term decision problems can be far more tractable than what is known from prior work.

2025-12-31

International Conference on Machine Learning (Accept (regular))

Leveraging Diversity for Privileged Multi-Teacher Knowledge Distillation for Facial Expression Recognition

Muhammad Haseeb Aslam

Marco Pedersoli

Alessandro L. Koerich

Eric Granger

2025-12-31

SSRN Electronic Journal (accepted)

LogicXGNN: Grounded Logical Rules for Explaining Graph Neural Networks

Chuqin Geng

Ziyu Zhao

Zhaoyue Wang

Haolin Ye

Yuhe Jiang

Xujie Si

Existing rule-based explanations for Graph Neural Networks (GNNs) provide global interpretability but often optimize and assess fidelity in an intermediate, uninterpretable concept space, overlooking the grounding quality of the final subgraph explanations for end users. This gap yields explanations that may appear faithful yet be unreliable in practice. To this end, we propose LogicXGNN, a post hoc framework that constructs logical rules over reliable predicates explicitly designed to capture the GNN's message-passing structure, thereby ensuring effective grounding. We further introduce data-grounded fidelity (

2025-12-31

International Conference on Learning Representations (Accept (Poster))

Do machine learning methods make better predictions than conventional ones in pharmacoepidemiology? A systematic review, meta-analysis, and network meta-analysis.

Ana Paula Bruno Pena-Gralle

Mireille E. Schnitzer

Sofia-Nada Boureguaa

Félix Morin

Marc-André Legault

Caroline Sirois

Alice Dragomir

Lucie Blais

2025-12-31

Artif. Intell. Medicine (published)

MANSION: Multi-floor lANguage-to-3D Scene generatIOn for loNg-horizon tasks

Lirong Che

Shuo Wen

Shan Huang

Chuang Wang

Yuzhe Yang

Gregory Dudek

Xueqian Wang

Jian Su

Real-world robotic tasks are long-horizon and often span multiple floors, requiring complex spatial reasoning. Existing embodied benchmarks, however, are largely confined to single-floor homes, failing to evaluate agents on realistic, building-scale tasks. We introduce MANSION, a language-driven framework for generating building-scale, multi-floor 3D environments for long-horizon tasks. Using this framework, we release MansionWorld, a large-scale dataset featuring over 1,000 diverse, non-residential buildings. These environments support cross-floor skills and long-horizon task generation on reusable building layouts. Experiments show that current methods degrade sharply on our multi-floor tasks, highlighting both the challenge and the value of this setting for advancing embodied AI.

2025-12-31

IEEE/CVF Conference on Computer Vision and Pattern Recognition (Accept (Poster))

Mapping of Subjective Accounts into Interpreted Clusters (MOSAIC): Topic Modelling and LLM applied to Stroboscopic Phenomenology

Romy Beauté

David J. Schwartzman

Guillaume Dumas

Jennifer Crook

Fiona Macpherson

Adam B. Barrett

Anil K. Seth

Stroboscopic light stimulation (SLS) on closed eyes typically induces simple visual hallucinations, characterized by vivid, geometric, and colourful patterns. A dataset of 898 sentences, extracted from 407 open subjective reports, was recently compiled as part of the Dreamachine programme (https://dreamachine.world/) (Collective Act, 2022), an immersive multisensory experience that combines SLS and spatial sound in a collective setting. Although open reports extend the range of reportable phenomenology, their analysis presents significant challenges, particularly in systematically identifying patterns. To address this challenge, we implemented a data-driven approach leveraging large language models and topic modelling to uncover and interpret latent experiential topics directly from the Dreamachine’s text-based reports. Our analysis confirmed the presence of simple visual hallucinations typically documented in scientific studies of SLS, while also revealing experiences of altered states of consciousness and complex hallucinations. Building on these findings, our computational approach expands the systematic study of subjective experience by enabling data-driven analyses of open-ended phenomenological reports, capturing experiences not readily identified through standard questionnaires. By revealing rich and multifaceted aspects of experiences, our study broadens our understanding of stroboscopically induced phenomena while highlighting the potential of natural language processing and large language models in the field of computational phenomenology. More generally, this approach provides a practically applicable methodology for uncovering subtle hidden patterns of subjective experience across diverse research domains. Open-source implementation and an interactive web application are provided to facilitate application of this methodology.

2025-12-31

Neuroscience of Consciousness (published)

MedRiskEval: Medical Risk Evaluation Benchmark of Language Models, On the Importance of User Perspectives in Healthcare Settings

Jean-Philippe Corbeil

Minseon Kim

Maxime Griot

Sheela Agarwal

Alessandro Sordoni

Francois Beaulieu

Paul Vozila

Jean-Philippe Corbeil, Minseon Kim, Maxime Griot, Sheela Agarwal, Alessandro Sordoni, Francois Beaulieu, Paul Vozila. Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track). 2026.

2025-12-31

Conference of the European Chapter of the Association for Computational Linguistics (published)

Laurence Perreault-Levasseur

MIRA: A Score for Conditional Distribution Accuracy and Model Comparison

Sammy Sharief

Justine Zeghal

Gabriel Missael Barco

Pablo Lemos

Yashar Hezaveh

We present Mira, a method for estimating the expected probability that samples from a candidate conditional distribution match the true, unknown conditional distribution, for which only data-label pairs are available. We derive theoretical bounds obtained when the candidate distribution matches the true one and when the conditional distributions are independent. This framework thus enables model comparison by quantifying the alignment between the conditional distribution of a candidate model and the data-label pairs of the true model. Consequently, Mira enables Bayesian model comparison through direct posterior validation, bypassing the challenging evidence computation. We demonstrate its effectiveness across several toy problems and Bayesian inference tasks.

2025-12-31

International Conference on Machine Learning (Accept (spotlight))

Nonlinear Observer Design for Visual-Inertial Odometry

Mouaad Boughellaba

Abdelhamid Tayebi

James Richard Forbes

Soulaimane Berkane

This paper addresses the problem of Visual-Inertial Odometry (VIO) for rigid body systems evolving in three-dimensional space. We introduce a novel matrix Lie group structure, denoted SE_{3+n}(3), that unifies the pose, gravity, linear velocity, and landmark positions within a consistent geometric framework tailored to the VIO problem. Building upon this formulation, we design an almost globally asymptotically stable nonlinear geometric observer that tightly integrates data from an Inertial Measurement Unit (IMU) and visual sensors. Unlike conventional Extended Kalman Filter (EKF)-based estimators that rely on local linearization and thus ensure only local convergence, the proposed observer achieves almost global stability through the decoupling of the rotational and translational dynamics. A globally exponentially stable Riccati-based translational observer along with an almost global input-to-state stable attitude observer are designed such that the overall cascaded observer enjoys almost global asymptotic stability. This cascaded architecture guarantees robust and consistent estimation of the extended state, including orientation, position, velocity, gravity, and landmark positions, up to the VIO unobservable directions (i.e., a global translation and rotation about gravity). The effectiveness of the proposed scheme is demonstrated through numerical simulations as well as experimental validation on the EuRoC MAV dataset, highlighting its robustness and suitability for real-world VIO applications.

2025-12-31

arXiv (preprint)

Online HD-tRNS over the Right Temporoparietal Junction Enhances Mentalizing during Social Interactions

Vincent Chamberland

Quentin Moreau

Lisane Moses

Gabriela Milanova

Guillaume Dumas

2025-12-31

Neuromodulation: Technology at the Neural Interface (published)

Operationalizing the Superficial Alignment Hypothesis via Task Complexity

Tiago Pimentel

The superficial alignment hypothesis (SAH) posits that large language models learn most of their knowledge during pre-training, and that post-training merely surfaces this knowledge. The SAH, however, lacks a precise definition, which has led to (i) different and seemingly orthogonal arguments supporting it, and (ii) important critiques of it. We propose a new metric called Task Complexity: the length of the shortest program that achieves a target performance on a task. In this framework, the SAH claims that pre-trained models drastically reduce the task complexity of achieving high performance on many tasks. Our definition unifies prior arguments supporting the SAH, interpreting them as different strategies to find such short programs. Experimentally, we estimate task complexities of mathematical reasoning, machine translation, and instruction following tasks and show that their respective task complexities can be remarkably low when conditioned on a pre-trained model. Further, we find that pre-training enables access to strong performance on our tasks, but it can require programs of gigabytes of length to access it. Post-training, on the other hand, collapses the complexity of reaching this same performance by several orders of magnitude. Overall, our results highlight that task adaptation can require remarkably little information, often just a few kilobytes.

2025-12-31

International Conference on Machine Learning (Accept (regular))