Publications

Enhanced Multi-Class Arrhythmia Detection Using Generative Adversarial Networks for Minority Class Augmentation

Heba Ismail

Mohamed Adel Serhani

Benjamin C. M. Fung

2025-12-26

Cognitive Computation (published)

doi.org

Now is the time: operationalizing generative neurophenomenology through interpersonal methods

Anne Monnier

Lena Adel

Guillaume Dumas

Lived experience is shaped by intersubjective, social, cultural, and historical dimensions. For the past 30 years, neurophenomenology has adopted an embodied perspective of the mind by integrating first-person experiential and third-person neurobehavioral perspectives. Neurophenomenology reveals mutual constraints between both, as they co-constitute a person’s lived experience. This article emphasizes the intersubjective and social facets of lived experience as well as the readiness of the scientific community to use a "generative neurophenomenology" approach, envisioned in the 1990s by Francisco Varela. For this endeavour, we clarify three meanings of “generative” as it applies distinctly to generative phenomenology, generative passages, and generative models. Then, we propose to combine existing methods to update the neurophenomenology program: first, by transitioning from individual to multiple-people phenomenology methods that include intersubjective experience; second, by expanding traditional neuroscience to include measures of multimodal interpersonal synchrony; and third, by leveraging multiple computational tools to integrate different viewpoints, thereby enriching our understanding of lived experience. We also underscore the potential of diverse mathematical formalisms to capture aspects of human experience, while stressing that using computational approaches to model neurophenomenology does not entail endorsing computationalism as a grounding hypothesis of human experience. Finally, we illustrate the clinical relevance of this paradigm through two case studies in psychiatry, (1) with interactive dyads in autism and (2) with multiple members in family therapy sessions, demonstrating its translational potential.

2025-12-26

Neuroscience of Consciousness (published)

doi.org

SPECTRE: Spectral Pre-training Embeddings with Cylindrical Temporal Rotary Position Encoding for Fine-Grained sEMG-Based Movement Decoding

Zihan Weng

Chanlin Yi

Pouya Bashivan

Jing Lu

Fali Li

Dezhong Yao

Jingming Hou

Yangsong Zhang

Peng Xu

Decoding fine-grained movement from non-invasive surface Electromyography (sEMG) is a challenge for prosthetic control due to signal non-stationarity and low signal-to-noise ratios. Generic self-supervised learning (SSL) frameworks often yield suboptimal results on sEMG as they attempt to reconstruct noisy raw signals and lack the inductive bias to model the cylindrical topology of electrode arrays. To overcome these limitations, we introduce SPECTRE, a domain-specific SSL framework. SPECTRE features two primary contributions: a physiologically-grounded pre-training task and a novel positional encoding. The pre-training involves masked prediction of discrete pseudo-labels from clustered Short-Time Fourier Transform (STFT) representations, compelling the model to learn robust, physiologically relevant frequency patterns. Additionally, our Cylindrical Rotary Position Embedding (CyRoPE) factorizes embeddings along linear temporal and annular spatial dimensions, explicitly modeling the forearm sensor topology to capture muscle synergies. Evaluations on multiple datasets, including challenging data from individuals with amputation, demonstrate that SPECTRE establishes a new state-of-the-art for movement decoding, significantly outperforming both supervised baselines and generic SSL approaches. Ablation studies validate the critical roles of both spectral pre-training and CyRoPE. SPECTRE provides a robust foundation for practical myoelectric interfaces capable of handling real-world sEMG complexities.

2025-12-26

ArXiv (preprint)

doi.org

arxiv.org

Causally informed, multifactorial pathways linking cognition and personality to adolescent mental health

Jiadong Yan

Bin Wan

Paule Joanne Toussaint

Judy Chen

Gleb Bezgin

Yasser Iturria-Medina

Danilo Bzdok

Alan Evans

Sherif Karama

Adolescence is a sensitive period for the emergence of psychopathology. During this time, physiological changes and environmental exposures jointly shape brain development and influence cognitive and personality maturation, collectively heightening vulnerability to mental disorders. However, the complexity of interactions between these factors has hindered a systems-level understanding of mental health and the causal roles of cognition and personality in psychopathology. In this study, we proposed a multifactorial causal framework integrating brain, pubertal, environmental, and behavioral factors to characterize heterogeneity in adolescent mental health trajectories at the individual level. We then investigated latent causal pathways linking cognition and personality to mental health outcomes and identified potential personalized intervention targets. Leveraging the Adolescent Brain Cognitive Development (ABCD) dataset (N = 4,501), we analyzed 165 behavioral pairs connecting cognition and personality traits to mental health symptoms. Using cross-sectional multivariate mediation and longitudinal interaction-inclusive analyses, we identified 68 behavioral pairs showing significant causal relationships, with brain and environmental exposures contributing to most pathways, while pubertal factors exhibited limited involvement. Individualized interpretive analyses further revealed 23 pairs suggesting potential interventions with response rates exceeding 50%. Among these, behavioral inhibition, negative urgency, and processing speed emerged as the most common intervention targets, whereas psychosis symptoms and attention problems were the most likely issues to improve.
Overall, our study advances a comprehensive framework capturing the multifactorial and heterogeneous nature of adolescent mental health, delineates specific causal pathways from cognitive and personality traits to psychopathology, and provides a principled basis for potential individualized intervention strategies.

2025-12-25

bioRxiv (preprint)

doi.org

A Comedy of Estimators: On KL Regularization in RL Training of LLMs

Vedant Shah

Johan Obando-Ceron

Vineet Jain

Brian Bartoldson

Bhavya Kailkhura

Nikolay Malkin

The reasoning performance of large language models (LLMs) can be substantially improved by training them with reinforcement learning (RL). The RL objective for LLM training involves a regularization term, which is the reverse Kullback-Leibler (KL) divergence between the trained policy and the reference policy. Since computing the KL divergence exactly is intractable, various estimators are used in practice to estimate it from on-policy samples. Despite its wide adoption, including in several open-source libraries, there is no systematic study analyzing the numerous ways of incorporating KL estimators in the objective and their effect on the downstream performance of RL-trained models. Recent works show that prevailing practices for incorporating KL regularization do not provide correct gradients for stated objectives, creating a discrepancy between the objective and its implementation. In this paper, we further analyze these practices and study the gradients of several estimator configurations, revealing how design choices shape gradient bias. We substantiate these findings with empirical observations by RL fine-tuning Qwen2.5-7B, Llama-3.1-8B-Instruct and Qwen3-4B-Instruct-2507 with different configurations and evaluating their performance on both in- and out-of-distribution tasks. Through our analysis, we observe that, in on-policy settings: (1) estimator configurations with biased gradients can result in training instabilities; and (2) using estimator configurations resulting in unbiased gradients leads to better performance on in-domain as well as out-of-domain tasks. We also investigate the performance resulting from different KL configurations in off-policy settings and observe that KL regularization can help stabilize off-policy RL training resulting from asynchronous setups.
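As an illustration of what estimating the reverse KL from on-policy samples can look like, here is a minimal sketch. The abstract does not name the estimators it studies; the three below (often called k1, k2 and k3) are an assumption, picked because they commonly appear in RL fine-tuning code. All function names are hypothetical, and the sketch covers estimator values only, not the gradient bias the paper analyzes.

```python
import math

def kl_estimators(logp_policy, logp_ref):
    """Per-sample estimates of KL(policy || ref) for samples drawn from the policy.

    logp_policy[i] and logp_ref[i] are the log-probabilities that the trained
    policy and the reference policy assign to the i-th sampled item.
    """
    k1, k2, k3 = [], [], []
    for lq, lp in zip(logp_policy, logp_ref):
        log_r = lp - lq                        # log of the ratio ref / policy
        k1.append(-log_r)                      # unbiased, can be negative
        k2.append(0.5 * log_r ** 2)            # biased, always non-negative
        k3.append(math.expm1(log_r) - log_r)   # (r - 1) - log r: unbiased, non-negative
    return k1, k2, k3

def exact_kl(q, p):
    """Exact KL(q || p) for two categorical distributions with full support."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p))

def expectation(q, values):
    """Expectation of per-outcome values under the categorical distribution q."""
    return sum(qi * v for qi, v in zip(q, values))

if __name__ == "__main__":
    q = [0.5, 0.3, 0.2]   # toy policy (assumed values)
    p = [0.4, 0.4, 0.2]   # toy reference (assumed values)
    k1, k2, k3 = kl_estimators([math.log(x) for x in q], [math.log(x) for x in p])
    print("exact:", exact_kl(q, p))
    print("E[k1]:", expectation(q, k1))
    print("E[k2]:", expectation(q, k2))
    print("E[k3]:", expectation(q, k3))
```

On a toy categorical distribution the expectations can be computed exactly rather than sampled, which makes the difference between the unbiased estimators (k1, k3) and the biased one (k2) easy to check.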

2025-12-25

ArXiv (preprint)

doi.org

arxiv.org

Shape of Thought: When Distribution Matters More than Correctness in Reasoning Tasks

Abhranil Chandra

Ayush Agrawal

Arian Hosseini

Sebastian Fischmeister

Rishabh Agarwal

Navin Goyal

Aaron Courville

2025-12-23

ArXiv (preprint)

doi.org

arxiv.org

Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning

Seijin Kobayashi

Yanick Schimpf

Maximilian Schlegel

Angelika Steger

Maciej Wolczyk

Johannes Von Oswald

Nino Scherrer

Kaitlin Maile

Guillaume Lajoie

Blake Aaron Richards

Rif A. Saurous

James Manyika

Blaise Agüera y Arcas

Alexander Meulemans

João Sacramento

Large-scale autoregressive models pretrained on next-token prediction and finetuned with reinforcement learning (RL) have achieved unprecedented success on many problem domains. During RL, these models explore by generating new outputs, one token at a time. However, sampling actions token-by-token can result in highly inefficient learning, particularly when rewards are sparse. Here, we show that it is possible to overcome this problem by acting and exploring within the internal representations of an autoregressive model. Specifically, to discover temporally-abstract actions, we introduce a higher-order, non-causal sequence model whose outputs control the residual stream activations of a base autoregressive model. On grid world and MuJoCo-based tasks with hierarchical structure, we find that the higher-order model learns to compress long activation sequence chunks onto internal controllers. Critically, each controller executes a sequence of behaviorally meaningful actions that unfold over long timescales and are accompanied with a learned termination condition, such that composing multiple controllers over time leads to efficient exploration on novel tasks. We show that direct internal controller reinforcement, a process we term "internal RL", enables learning from sparse rewards in cases where standard RL finetuning fails. Our results demonstrate the benefits of latent action generation and reinforcement in autoregressive models, suggesting internal RL as a promising avenue for realizing hierarchical RL within foundation models.

2025-12-22

ArXiv (preprint)

doi.org

arxiv.org

Energy-Efficient Multi-LLM Reasoning for Binary-Free Zero-Day Detection in IoT Firmware

Saeid Jamshidi

Omar Abdul-Wahab

Martine Bellaiche

Foutse Khomh

Securing Internet of Things (IoT) firmware remains difficult due to proprietary binaries, stripped symbols, heterogeneous architectures, and limited access to executable code. Existing analysis methods, such as static analysis, symbolic execution, and fuzzing, depend on binary visibility and functional emulation, making them unreliable when firmware is encrypted or inaccessible. To address this limitation, we propose a binary-free, architecture-agnostic solution that estimates the likelihood of conceptual zero-day vulnerabilities using only high-level descriptors. The approach integrates a tri-LLM reasoning architecture combining a LLaMA-based configuration interpreter, a DeepSeek-based structural abstraction analyzer, and a GPT-4o semantic fusion model. The solution also incorporates LLM computational signatures, including latency patterns, uncertainty markers, and reasoning depth indicators, as well as an energy-aware symbolic load model, to enhance interpretability and operational feasibility. In addition, we formally derive the mathematical foundations of the reasoning pipeline, establishing monotonicity, divergence, and energy-risk coupling properties that theoretically justify the model's behavior. Simulation-based evaluation reveals that high exposure conditions increase the predicted zero-day likelihood by 20 to 35 percent across models, with GPT-4o demonstrating the strongest cross-layer correlations and the highest sensitivity. Energy and divergence metrics significantly predict elevated risk (p < 0.01), reinforcing the effectiveness of the proposed reasoning framework.

2025-12-22

ArXiv (preprint)

doi.org

arxiv.org

Hidden sampling biases inflate performance in gene regulatory network inference

Marco Stock

Florin Ratajczak

Paul Bertin

Eva Hoermanseder

Yoshua Bengio

Jason Hartford

Pascal Falter-Braun

Matthias Heinig

Alexander Tong

Antonio Scialdone

Accurate reconstruction of gene regulatory networks (GRNs) from single-cell transcriptomic data remains a major methodological challenge. Recent machine learning approaches, particularly graph neural networks and graph autoencoders, have reported improved performance, yet these gains do not consistently translate to realistic biological settings. Here, we show that a key reason for that is the way negative regulatory interactions are sampled for supervised training and evaluation. We find that widely used sampling strategies introduce node-degree biases that allow models to exploit trivial graph-structural cues rather than biological signals. Across multiple benchmarks, simple degree-based heuristics match or exceed state-of-the-art graph neural network models under these biased evaluation protocols. We further introduce a degree-aware sampling approach that eliminates these artifacts and provides more reliable assessments of GRN inference methods. Our results call for standardized, bias-aware benchmarking practices to ensure meaningful progress in supervised GRN inference from single-cell RNA-seq data.
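To make the idea of a degree-based heuristic concrete, here is a minimal sketch of one such baseline. The abstract does not specify which heuristics were evaluated, so the scoring rule below (out-degree of the regulator times in-degree of the target, counted on training edges only) is an assumption, and the function name and gene labels are hypothetical.

```python
from collections import Counter

def degree_scores(train_edges, candidates):
    """Score candidate (regulator, target) pairs using node degrees only.

    No expression data is used: a pair scores highly simply because both of
    its endpoints are well connected in the training edges.
    """
    out_deg = Counter(src for src, _ in train_edges)
    in_deg = Counter(dst for _, dst in train_edges)
    # Counter returns 0 for nodes that never appear in the training edges.
    return [out_deg[src] * in_deg[dst] for src, dst in candidates]

if __name__ == "__main__":
    train = [("TF1", "g1"), ("TF1", "g2"), ("TF1", "g3"), ("TF2", "g1")]
    cands = [("TF1", "g1"), ("TF2", "g3"), ("TF3", "g4")]
    for pair, score in zip(cands, degree_scores(train, cands)):
        print(pair, score)
```

If negative examples are sampled in a way that favors low-degree nodes, a score like this can separate positives from negatives without any biological signal, which is the kind of artifact the abstract describes.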

2025-12-22

bioRxiv (preprint)

doi.org

Mitochondria‐nucleus crosstalk characterizes Alzheimer's disease across 1,5 million brain cells

Emerging insight from stem cell research reinforces Alzheimer's disease (AD) to affect mitochondrial protein expression. Compelling new evidence points to mitochondrial reactive oxygen species (ROS) as a potential driving player in Aβ toxicity, mediated through glial cells and ultimately impacting neuronal health. A comprehensive understanding of how oxidative phosphorylation variations relate to cell function remains largely unexplored, especially through a cell type lens. Leveraging today's largest single‐nucleus RNA sequencing dataset of AD, we unveil how cell‐type‐specific mitochondrial alterations reverberate in the nuclear transcriptome, in 424 AD patients and healthy controls from ROSMAP. By adopting a supervised latent factor modelling approach, we identified distinct gene modules capturing unique aspects of the mitochondrial crosstalk in 6 major brain cell types across 5,427 nuclear and 13 mitochondrial genes. We found that nuclear‐mitochondrial crosstalk varies distinctly with cell identity, reflecting metabolic demands and functional specialization. In neurons and oligodendrocytes, ATP synthase (complex V) takes a central role, whereas type 1 NADH dehydrogenase (complex I) is more prominent in astrocytes, microglia, and OPCs. Screening across >1 million gene expression profiles from ∼20,000 drug perturbations identified mitochondrial‐nuclear signatures that resemble those activated by parthenolide and niclosamide, two chemical compounds previously associated with oxidative stress and cytotoxicity via ubiquitination, as most predictive of AD. Microglia and OPCs achieved the highest overall classification accuracy, with stronger predictive performance observed in males than in females. Mapping gene module expressions to the Allen Human Brain Atlas revealed shared whole‐brain patterns highlighting the precuneus, which we implicated in ubiquitin‐cascade‐enriched modules.
Clinical phenotyping revealed that males with higher AD risk, as indicated by their mitochondrial‐nuclear scores on glial gene modules, exhibited a greater pathological burden, including higher amyloid load, Parkinson's‐like symptoms, and neuroticism‐related traits. Finally, by comparing our findings with 2.5 million CRISPRi‐based perturbations, we identified neural signatures associated with female‐biased transcription factors and fatty acid biosynthesis, while glial signatures were linked to DNA damage and oxidative stress. By integrating multiple layers of biological data from established reference atlases, our analysis of mitochondria‐nuclear crosstalk revealed distinct transcriptional signatures associated with AD risk in glial and neural cells, with these associations exhibiting sex‐biased patterns.

2025-12-22

Alzheimer's & Dementia (published)

doi.org

Turncoat antibodies unmasked in a model of autoimmune demyelination: from biology to therapy

Reza Taghipour-Mirakmahaleh

Françoise Morin

Yu Zhang

Louis Bourhoven

Louis-Charles Béland

Qun Zhou

Julie Jaworski

Anna Park

Juan Manuel Dominguez

J. Corbeil

Eoin P Flanagan

Romain Marignier

Catherine Larochelle

Steven Kerfoot

Luc Vallières

Autoantibodies contribute to many autoimmune diseases, yet there is no approved therapy to neutralize them selectively. A popular mouse model, experimental autoimmune encephalomyelitis (EAE), could serve to develop such a therapy, provided we can better understand the nature and importance of the autoantibodies involved. Here we report the discovery of autoantibody-secreting extrafollicular plasmablasts in EAE induced with specific myelin oligodendrocyte glycoprotein (MOG) antigens. Single-cell RNA sequencing reveals that these cells produce non-affinity-matured IgG antibodies. These include pathogenic antibodies competing for shared binding space on MOG’s extracellular domain. Interestingly, the synthetic anti-MOG antibody 8-18C5 can prevent the binding of pathogenic antibodies from either EAE mice or people with MOG antibody disease (MOGAD). Moreover, an 8-18C5 variant carrying the NNAS mutation, which inactivates its effector functions, can reduce EAE severity and promote functional recovery. In brief, this study provides not only a comprehensive characterization of the humoral response in EAE models, but also a proof of concept for a novel therapy to antagonize pathogenic anti-MOG antibodies.

2025-12-22

Proceedings of the National Academy of Sciences of the United States of America (published)

doi.org

Fine-Tuned In-Context Learners for Efficient Adaptation

Jörg Bornschein

Clare Lyle

Yazhe Li

Amal Rannen-Triki

Xu Owen He

Razvan Pascanu

When adapting large language models (LLMs) to a specific downstream task, two primary approaches are commonly employed: (1) prompt engineering, often with in-context few-shot learning, leveraging the model's inherent generalization abilities, and (2) fine-tuning on task-specific data, directly optimizing the model's parameters. While prompt-based methods excel in few-shot scenarios, their effectiveness often plateaus as more data becomes available. Conversely, fine-tuning scales well with data but may underperform when training examples are scarce. We investigate a unified approach that bridges these two paradigms by incorporating in-context learning directly into the fine-tuning process. Specifically, we fine-tune the model on task-specific data augmented with in-context examples, mimicking the structure of k-shot prompts. This approach, while requiring per-task fine-tuning, combines the sample efficiency of in-context learning with the performance gains of fine-tuning, leading to a method that consistently matches and often significantly exceeds both these baselines. To perform hyperparameter selection in the low-data regime, we propose to use prequential evaluation, which eliminates the need for expensive cross-validation and leverages all available data for training while simultaneously providing a robust validation signal. We conduct an extensive empirical study to determine which adaptation paradigm (fine-tuning, in-context learning, or our proposed unified approach) offers the best predictive performance on concrete downstream tasks.
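The prequential evaluation mentioned in the abstract can be sketched generically as "predict the next example, then train on it". The version below is a minimal illustration under assumptions: the abstract does not give the exact protocol (for example, whether examples are processed one at a time or in blocks), and `prequential_loss`, `fit` and `loss` are hypothetical names with a toy mean predictor standing in for a fine-tuned model.

```python
def prequential_loss(examples, fit, loss, start=1):
    """Sum of losses where each example is predicted by a model fitted
    only on the examples that came before it.

    Every example after the first `start` is used once for validation and
    afterwards for training, so no separate held-out split is needed.
    """
    total = 0.0
    for t in range(start, len(examples)):
        model = fit(examples[:t])          # train on the prefix seen so far
        total += loss(model, examples[t])  # score the next, unseen example
    return total

def fit_mean(prefix):
    """Toy 'model': the mean of the targets seen so far."""
    return sum(prefix) / len(prefix)

def squared_error(model, target):
    """Squared error of the toy model's constant prediction."""
    return (model - target) ** 2

if __name__ == "__main__":
    data = [1.0, 3.0, 2.0]
    print(prequential_loss(data, fit_mean, squared_error))
```

For hyperparameter selection, one would compute this quantity once per candidate configuration and keep the configuration with the lowest total.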

2025-12-21

ArXiv (preprint)

doi.org

arxiv.org
