Publications

Mapping of Subjective Accounts into Interpreted Clusters (MOSAIC): Topic Modelling and LLM applied to Stroboscopic Phenomenology
Romy Beauté
David J. Schwartzman
Jennifer Crook
Fiona Macpherson
Adam B. Barrett
Anil K. Seth
Stroboscopic light stimulation (SLS) on closed eyes typically induces simple visual hallucinations, characterized by vivid, geometric, and colourful patterns. A dataset of 898 sentences, extracted from 407 open subjective reports, was recently compiled as part of the Dreamachine programme (https://dreamachine.world/) (Collective Act, 2022), an immersive multisensory experience that combines SLS and spatial sound in a collective setting. Although open reports extend the range of reportable phenomenology, their analysis presents significant challenges, particularly in systematically identifying patterns. To address this challenge, we implemented a data-driven approach leveraging large language models and topic modelling to uncover and interpret latent experiential topics directly from the Dreamachine’s text-based reports. Our analysis confirmed the presence of simple visual hallucinations typically documented in scientific studies of SLS, while also revealing experiences of altered states of consciousness and complex hallucinations. Building on these findings, our computational approach expands the systematic study of subjective experience by enabling data-driven analyses of open-ended phenomenological reports, capturing experiences not readily identified through standard questionnaires. By revealing rich and multifaceted aspects of experiences, our study broadens our understanding of stroboscopically induced phenomena while highlighting the potential of natural language processing and large language models in the field of computational phenomenology. More generally, this approach provides a practically applicable methodology for uncovering subtle hidden patterns of subjective experience across diverse research domains. Open-source implementation and an interactive web application are provided to facilitate application of this methodology.
MedRiskEval: Medical Risk Evaluation Benchmark of Language Models, On the Importance of User Perspectives in Healthcare Settings
Jean-Philippe Corbeil
Minseon Kim
Maxime Griot
Sheela Agarwal
Francois Beaulieu
Paul Vozila
Jean-Philippe Corbeil, Minseon Kim, Maxime Griot, Sheela Agarwal, Alessandro Sordoni, Francois Beaulieu, Paul Vozila. Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track). 2026.
Nonlinear Observer Design for Visual-Inertial Odometry
Mouaad Boughellaba
Abdelhamid Tayebi
Soulaimane Berkane
This paper addresses the problem of Visual-Inertial Odometry (VIO) for rigid body systems evolving in three-dimensional space. We introduce a novel matrix Lie group structure, denoted SE_{3+n}(3), that unifies the pose, gravity, linear velocity, and landmark positions within a consistent geometric framework tailored to the VIO problem. Building upon this formulation, we design an almost globally asymptotically stable nonlinear geometric observer that tightly integrates data from an Inertial Measurement Unit (IMU) and visual sensors. Unlike conventional Extended Kalman Filter (EKF)-based estimators that rely on local linearization and thus ensure only local convergence, the proposed observer achieves almost global stability through the decoupling of the rotational and translational dynamics. A globally exponentially stable Riccati-based translational observer along with an almost global input-to-state stable attitude observer are designed such that the overall cascaded observer enjoys almost global asymptotic stability. This cascaded architecture guarantees robust and consistent estimation of the extended state, including orientation, position, velocity, gravity, and landmark positions, up to the VIO unobservable directions (i.e., a global translation and rotation about gravity). The effectiveness of the proposed scheme is demonstrated through numerical simulations as well as experimental validation on the EuRoC MAV dataset, highlighting its robustness and suitability for real-world VIO applications.
Online HD-tRNS over the Right Temporoparietal Junction Enhances Mentalizing during Social Interactions
Vincent Chamberland
Quentin Moreau
Lisane Moses
Gabriela Milanova
Optimizing User Profiles via Contextual Bandits for Retrieval-Augmented LLM Personalization
Zichen Zhao
Fuyuan Lyu
Xiuying Chen
Jikun Kang
Xue Liu
Large Language Models (LLMs) excel at general-purpose tasks, yet adapting their responses to individual users remains challenging. Retrieval augmentation provides a lightweight alternative to fine-tuning by conditioning LLMs on user history records, and existing approaches typically select these records based on semantic relevance. We argue that relevance serves as an unreliable proxy for utility: a record may be semantically similar to a query yet fail to improve generation quality or even degrade it due to redundancy or conflicting information. To bridge this gap, we propose PURPLE, a contextual bandit framework that oPtimizes UseR Profiles for Llm pErsonalization. In contrast to a greedy selection of the most relevant records, PURPLE treats profile construction as a set generation process and utilizes a Plackett-Luce ranking model to capture complex inter-record dependencies. By training with dense feedback provided by the likelihood of the reference response, our method aligns retrieval directly with generation quality. Extensive experiments on nine personalization tasks demonstrate that PURPLE consistently outperforms strong heuristic and retrieval-augmented baselines in both effectiveness and efficiency, establishing a principled and scalable solution for optimizing user profiles.
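The Plackett-Luce model mentioned in this abstract can be illustrated with a minimal sketch. It is not PURPLE's implementation: the fixed per-record `scores`, the function names, and the choice of `k` are assumptions made for illustration, and the sketch does not model the inter-record dependencies the abstract describes. It only shows the generic mechanism of drawing an ordered selection without replacement and computing its log-probability.

```python
import math
import random


def plackett_luce_log_prob(scores, ranking):
    """Log-probability of an ordered selection under a Plackett-Luce model."""
    remaining = list(range(len(scores)))
    log_prob = 0.0
    for item in ranking:
        # Log-sum-exp over the items still available at this step.
        top = max(scores[i] for i in remaining)
        log_norm = top + math.log(sum(math.exp(scores[i] - top) for i in remaining))
        log_prob += scores[item] - log_norm
        remaining.remove(item)
    return log_prob


def sample_profile(scores, k, rng):
    """Sample k distinct record indices, one at a time, without replacement."""
    remaining = list(range(len(scores)))
    chosen = []
    for _ in range(k):
        # Softmax weights over the remaining items (shifted for stability).
        top = max(scores[i] for i in remaining)
        weights = [math.exp(scores[i] - top) for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


# Hypothetical utility scores for five history records (higher = more useful).
scores = [2.0, 0.5, 1.0, -1.0, 0.0]
rng = random.Random(0)
profile = sample_profile(scores, k=3, rng=rng)
log_prob = plackett_luce_log_prob(scores, profile)
```

In a policy-gradient setting, a log-probability of this kind is the quantity that a reward signal would typically weight; how PURPLE actually uses it is not specified in the abstract.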
PAC-X: Fuzzy Explainable AI for Multi-Class Malware Detection
Mohd Saqib
Benjamin C. M. Fung
Philippe Charland
PheCode-guided multi-modal topic modeling of electronic health records improves disease incidence prediction and GWAS discovery from UK Biobank
Ziqi Yang
Ziyang Song
Phenome-wide association studies rely on disease definitions derived from diagnostic codes, often failing to leverage the full richness of electronic health records (EHR). We present MixEHR-SAGE, a PheCode-guided multi-modal topic model that integrates diagnoses, procedures, and medications to enhance phenotyping from large-scale EHRs. By combining expert-informed priors with probabilistic inference, MixEHR-SAGE identifies over 1000 interpretable phenotype topics from UK Biobank data. Applied to 350 000 individuals with high-quality genetic data, MixEHR-SAGE-derived risk scores accurately predict incident type 2 diabetes (T2D) and leukemia diagnoses. Subsequent genome-wide association studies using these continuous risk scores uncovered novel disease-associated loci, including PPP1R15A for T2D and JMJD6/SRSF2 for leukemia, that were missed by traditional binary case definitions. These results highlight the potential of probabilistic phenotyping from multi-modal EHRs to improve genetic discovery. The MixEHR-SAGE software is publicly available at: https://github.com/li-lab-mcgill/MixEHR-SAGE.
Piezoelectric tuning of thermal conductivity in nano-architected gallium nitride metamaterials
Jun Cai
Alireza Seyedkanani
Benyamin Shahryari
Abdolhamid Akbarzadeh
Practical Solutions to Volt-var Optimization under Uncertainty via Blackbox Optimization
In this work, we propose an optimal reactive power dispatch (ORPD) stochastic program for volt-var optimization (VVO) of power distribution networks. The formulation considers not only control settings of conventional VVO devices, e.g., voltage regulators, capacitor banks, and on-load tap changers, but also optimal settings for volt-var droop curves of distributed energy resources (DERs), compliant with the IEEE 1547-2018 standard. Instead of including the power flow equations in the optimization problem, which would make it nonlinear and nonconvex, a power flow solver is utilized and the problem is solved by blackbox optimization (BBO). The feasibility of the derived solution is improved by using unbalanced power flow simulations. The solution is effective under various demand and DER generation scenarios such that device settings are not frequently changed, making it practical for in-field implementations. Through numerical simulations on IEEE test feeders, we illustrate the performance of the solutions of our proposed approach on both in-sample and out-of-sample scenarios. We show that our approach outperforms a benchmark reinforcement learning method and is also scalable to large-scale distribution networks.
Press Start to Charge: Videogaming the Online Centralized Charging Scheduling Problem
Alireza Ghahtarani
Martin Cousineau
Jorge E. Mendoza
Property-Driven Protein Inverse Folding with Multi-Objective Preference Alignment
Junqi Liu
Xiaoyang Hou
Xin Liu
Zhi Yang
Protein sequence design must balance designability, defined as the ability to recover a target backbone, with multiple, often competing, developability properties such as solubility, thermostability, and expression. Existing approaches address these properties through post hoc mutation, inference-time biasing, or retraining on property-specific subsets, yet they are target dependent and demand substantial domain expertise or careful hyperparameter tuning. In this paper, we introduce ProtAlign, a multi-objective preference alignment framework that fine-tunes pretrained inverse folding models to satisfy diverse developability objectives while preserving structural fidelity. ProtAlign employs a semi-online Direct Preference Optimization strategy with a flexible preference margin to mitigate conflicts among competing objectives and constructs preference pairs using in silico property predictors. Applied to the widely used ProteinMPNN backbone, the resulting model MoMPNN enhances developability without compromising designability across tasks including sequence design for CATH 4.3 crystal structures, de novo generated backbones, and real-world binder design scenarios, making it an appealing framework for practical protein sequence design.
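For readers unfamiliar with Direct Preference Optimization, the following is a minimal sketch of a per-pair DPO loss with an additive margin. It is not ProtAlign's code: the abstract does not specify how its "flexible preference margin" is defined, so the placement of `margin` inside the logit, the `beta` default, and all function and argument names are assumptions made for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss_with_margin(policy_chosen, policy_rejected,
                         ref_chosen, ref_rejected,
                         beta=0.1, margin=0.0):
    """DPO loss for one preference pair, given sequence log-likelihoods.

    policy_* are log-probabilities under the model being fine-tuned,
    ref_* under the frozen reference model. The margin term is a
    hypothetical offset: the chosen sequence must beat the rejected one
    by at least this much before the loss becomes small.
    """
    chosen_gain = policy_chosen - ref_chosen
    rejected_gain = policy_rejected - ref_rejected
    logit = beta * (chosen_gain - rejected_gain) - margin
    return softplus(-logit)  # equals -log(sigmoid(logit))


# Policy identical to the reference: the loss is log(2).
base = dpo_loss_with_margin(-10.0, -12.0, -10.0, -12.0)
# Policy has shifted probability toward the chosen sequence: lower loss.
better = dpo_loss_with_margin(-9.0, -13.0, -10.0, -12.0)
# Same policy, but a positive margin demands a larger gap: higher loss.
stricter = dpo_loss_with_margin(-9.0, -13.0, -10.0, -12.0, margin=0.5)
```

A margin of this kind is one common way to let pairs with a weak preference contribute less pressure than pairs with a strong one; whether ProtAlign sets it per pair or per objective is not stated in the abstract.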
Quantifying LLM Attention-Head Stability: Implications for Circuit Universality
In mechanistic interpretability, recent work scrutinizes transformer "circuits": sparse, mono- or multi-layer sub-computations that may reflect human-understandable functions. Yet these network circuits are rarely acid-tested for their stability across different instances of the same deep learning architecture. Without this, it remains unclear whether reported circuits emerge universally across labs or turn out to be idiosyncratic to a particular estimation instance, potentially limiting confidence in safety-critical settings. Here, we systematically study stability across refits in increasingly complex transformer language models of various sizes. We quantify, layer by layer, how similarly attention heads learn representations across independently initialized training runs. Our rigorous experiments show that (1) middle-layer heads are the least stable yet the most representationally distinct; (2) deeper models exhibit stronger mid-depth divergence; (3) unstable heads in deeper layers become more functionally important than their peers from the same layer; (4) applying weight decay optimization substantially improves attention-head stability across random model initializations; and (5) the residual stream is comparatively stable. Our findings establish the cross-instance robustness of circuits as an essential yet underappreciated prerequisite for scalable oversight, drawing contours around possible white-box monitorability of AI systems.
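One way to compare what two attention heads have learned across independently initialized runs is a representational similarity index. The sketch below uses linear centered kernel alignment (CKA), which is a swapped-in, commonly used choice and not necessarily the metric used in this paper; the abstract does not name one. The inputs are assumed to be head outputs recorded on the same set of inputs in both runs (rows are inputs, columns are features), and all names are hypothetical.

```python
import math


def center_columns(matrix):
    """Subtract each column's mean so every feature has zero mean."""
    n, d = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in matrix]


def cross_frobenius_sq(x, y):
    """Squared Frobenius norm of x^T y for row-aligned matrices x and y."""
    total = 0.0
    for a in range(len(x[0])):
        for b in range(len(y[0])):
            dot = sum(x[i][a] * y[i][b] for i in range(len(x)))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA between two activation matrices with the same number of rows."""
    xc, yc = center_columns(x), center_columns(y)
    denom = math.sqrt(cross_frobenius_sq(xc, xc) * cross_frobenius_sq(yc, yc))
    return cross_frobenius_sq(xc, yc) / denom


# Toy activations of one head on four inputs (hypothetical values).
run_a = [[1.0, 2.0], [3.0, 1.0], [0.0, 4.0], [2.0, 2.0]]
# A second "run" that learned the same features in a rotated basis.
run_b = [[-v, u] for u, v in run_a]
similarity = linear_cka(run_a, run_b)
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of the features, so two heads that encode the same information in different bases still score as similar, which is the property one would want when comparing independently trained models.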