Publications

Health data issues in Africa: time for digitization, standardization and harmonization

Abdoelnaser Degoot

Ismaël Koné

Shakuntala Baichoo

Mercy Ngungu

Nzisa Liku

Judit Kumuthini

Joyce Nakatumba‐Nabende

Foutse Khomh

Bubacarr Bah

This commentary discusses health data challenges in Africa, focusing on digitization, standardization, and harmonization as key solutions. It highlights how addressing these foundational issues can enable AI and data science to transform healthcare systems across the continent.

2025-06-30

Nature Communications (published)

doi.org

How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models

Dharshan Kumaran

Stephen M Fleming

Larisa Markeeva

Joseph Heyward

Andrea Banino

Mrinal Mathur

Razvan Pascanu

Simon Kayode Osindero

Benedetto De Martino

Petar Veličković

Viorica Patraucean

Large language models (LLMs) exhibit strikingly conflicting behaviors: they can appear steadfastly overconfident in their initial answers whilst at the same time being prone to excessive doubt when challenged. To investigate this apparent paradox, we developed a novel experimental paradigm, exploiting the unique ability to obtain confidence estimates from LLMs without creating memory of their initial judgments -- something impossible in human participants. We show that LLMs -- Gemma 3, GPT4o and o1-preview -- exhibit a pronounced choice-supportive bias that reinforces and boosts their estimate of confidence in their answer, resulting in a marked resistance to change their mind. We further demonstrate that LLMs markedly overweight inconsistent compared to consistent advice, in a fashion that deviates qualitatively from normative Bayesian updating. Finally, we demonstrate that these two mechanisms -- a drive to maintain consistency with prior commitments and hypersensitivity to contradictory feedback -- parsimoniously capture LLM behavior in a different domain. Together, these findings furnish a mechanistic account of LLM confidence that explains both their stubbornness and excessive sensitivity to criticism.

2025-06-30

arXiv (published)

doi.org

arxiv.org

HVAC-GRACE: Transferable Building Control via Heterogeneous Graph Neural Network Policies

Anaïs Berkes

Donna Vakalis

David Rolnick

Yoshua Bengio

Buildings consume 40% of global energy, with HVAC systems responsible for up to half of that demand. As energy use grows, optimizing HVAC efficiency is critical to meeting climate goals. While reinforcement learning (RL) offers a promising alternative to rule-based control, real-world adoption is limited by poor sample efficiency and generalisation. We introduce HVAC-GRACE, a graph-based RL framework that models buildings as heterogeneous graphs and integrates spatial message passing directly into temporal GRU gates. This enables each zone to learn control actions informed by both its own history and its structural context. Our architecture supports zero-shot transfer by learning topology-agnostic functions, but initial experiments reveal that this benefit depends on sufficient conditioned zone connectivity to maintain gradient flow. These findings highlight both the promise and the architectural requirements of scalable, transferable RL for building control.

2025-06-30

ICML.cc/2025/Workshop/CO-BUILD (poster)

openreview.net

Integrating equity, diversity, and inclusion throughout the lifecycle of artificial intelligence for healthcare: a scoping review

Ting Wang

Elham Emami

Dana Jafarpour

Raymond Tolentino

Genevieve Gore

S. A. Rahimi

The lack of Equity, Diversity, and Inclusion (EDI) principles in the lifecycle of Artificial Intelligence (AI) technologies in healthcare is a growing concern. Despite its importance, there is still a gap in understanding the initiatives undertaken to address this issue. This review aims to explore what and how EDI principles have been integrated into the design, development, and implementation of AI studies in healthcare. We followed the scoping review framework by Levac et al. and the Joanna Briggs Institute. A comprehensive search was conducted until April 29, 2022, across MEDLINE, Embase, PsycInfo, Scopus, and SCI-EXPANDED. Only research studies in which the integration of EDI in AI was the primary focus were included. Non-research articles were excluded. Two independent reviewers screened the abstracts and full texts, resolving disagreements by consensus or by consulting a third reviewer. To synthesize the findings, we conducted a thematic analysis and used a narrative description. We adhered to the PRISMA-ScR checklist for reporting scoping reviews. The search yielded 10,664 records, with 42 studies included. Most studies were conducted on the American population. Previous research has shown that AI models improve when socio-demographic factors such as gender and race are considered. Despite frameworks for EDI integration, no comprehensive approach systematically applies EDI principles in AI model development. Additionally, the integration of EDI into the AI implementation phase remains under-explored, and the representation of EDI within AI teams has been overlooked. This review reports on what and how EDI principles have been integrated into the design, development, and implementation of AI technologies in healthcare. We used a thorough search strategy and rigorous methodology, though we acknowledge limitations such as language and publication bias.
A comprehensive framework is needed to ensure that EDI principles are considered throughout the AI lifecycle. Future research could focus on strategies to reduce algorithmic bias, assess the long-term impact of EDI integration, and explore policy implications to ensure that AI technologies are ethical, responsible, and beneficial for all.

2025-06-30

PLOS Digital Health (published)

doi.org

Longitudinal changes in brain asymmetry track lifestyle and disease

Karin Saltoun

B. T. Thomas Yeo

Lynn Paul

Jörn Diedrichsen

Danilo Bzdok

Human beings may have evolved the largest asymmetries of brain organization in the animal kingdom. Hemispheric left-vs-right specialization is especially pronounced in species-unique capacities, including emotional processing such as facial judgments, language-based feats such as reading books, and creativity such as musical performances. We hence chart the largest longitudinal brain-imaging resource, and provide evidence that brain asymmetry changes continuously in a manner suggestive of neural plasticity throughout adulthood. In the UK Biobank population cohort, we demonstrate that whole-brain patterns of asymmetry changes show robust phenome-wide associations across 959 distinct variables spanning 11 categories. We also find that changes in brain asymmetry over years co-occur with changes among specific lifestyle markers. We uncover specific brain asymmetry changes which systematically co-occur with entering a new phase of life, namely retirement. Finally, we reveal relevance of evolving brain asymmetry within subjects to major disease categories across ~4500 total medical diagnoses. Our findings speak against the idea that asymmetrical neural systems are conserved throughout adulthood.

2025-06-30

Nature Communications (published)

doi.org

Model approximation in MDPs with unbounded per-step cost

Berk Bozkurt

Aditya Mahajan

Ashutosh Nayyar

Yi Ouyang

We consider the problem of designing a control policy for an infinite-horizon discounted cost Markov decision process …

2025-06-30

IEEE Transactions on Automatic Control (published)

doi.org

arxiv.org

A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment

Jean-Philippe Corbeil

Amin Dada

Jean-Michel Attendu

Asma Ben Abacha

Alessandro Sordoni

Lucas Caccia

François Beaulieu

Thomas Lin

Jens Kleesiek

Paul Vozila

High computation costs and latency of large language models such as GPT-4 have limited their deployment in clinical settings. Small language models (SLMs) offer a cost-effective alternative, but their limited capacity requires biomedical domain adaptation, which remains challenging. An additional bottleneck is the unavailability and high sensitivity of clinical data. To address these challenges, we propose a novel framework for adapting SLMs into high-performing clinical models. We introduce the MediPhi collection of 3.8B-parameter SLMs developed with our novel framework: pre-instruction tuning of experts on relevant medical and clinical corpora (PMC, Medical Guideline, MedWiki, etc.), model merging, and clinical-tasks alignment. To cover most clinical tasks, we extended the CLUE benchmark to CLUE+, doubling its size. Our expert models deliver relative improvements on this benchmark over the base model without any task-specific fine-tuning: 64.3% on medical entities, 49.5% on radiology reports, and 44% on ICD-10 coding (outperforming GPT-4-0125 by 14%). We unify the expert models into MediPhi via model merging, preserving gains across benchmarks. Furthermore, we built the MediFlow collection, a synthetic dataset of 2.5 million high-quality instructions on 14 medical NLP tasks, 98 fine-grained document types, and JSON format support. Alignment of MediPhi using supervised fine-tuning and direct preference optimization achieves further gains of 18.9% on average.

2025-06-30

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (published)

doi.org

arxiv.org

Modulation of leg trajectory by transcranial magnetic stimulation during walking

Héloïse Bourgeois

Rose Guay-Hottin

El-Mehdi Meftah

Marina Martinez

Marco Bonizzato

Dorothy Barthélemy

The primary motor cortex is involved in initiation and adaptive control of locomotion. However, the role of the motor cortex in controlling gait trajectories remains unclear. In animals, cortical neuromodulation allows for precise control of step height. We hypothesized that a similar control framework applies to humans, whereby cortical stimulation would primarily increase foot elevation. Transcranial magnetic stimulation (TMS) was applied over the motor cortex to assess the involvement of the corticospinal tract over the limb trajectory during human walking. Ten healthy adults (aged 20–32 years) participated in treadmill walking at 1.5 km/h. TMS was applied over the left motor cortex at an intensity of 120% of the threshold to elicit a dorsiflexion of the right ankle during the swing phase of gait. Electromyographic (EMG) measurements and three-dimensional (3D) lower limb kinematics were collected. When delivered during the early swing phase, TMS led to a significant increase in the maximum height of the right toe by a mean of 34.9% ± 9.6% (21.4 mm ± 7.9 mm, p = 0.032) and knee height by 52.8% ± 14.1% (28.8 mm ± 7.7 mm, p = 0.0021) across participants. These findings indicate that TMS can influence limb trajectory during walking, highlighting its potential as a tool for studying cortical control of locomotion.

2025-06-30

Scientific Reports (published)

doi.org

MSR37 Improve Analyst Accuracy in Systematic Literature Reviews Using Reliant Tabular and LLM-Based Relevance Scoring

Christoph R. Schlegel

Sam Work

Marc-Emmanuel Bellemare

2025-06-30

Value in Health (published)

doi.org

Multi-Armed Sampling Problem and the End of Exploration

Mohammad Pedramfar

Siamak Ravanbakhsh

This paper introduces the framework of multi-armed sampling, as the sampling counterpart to the optimization problem of multi-armed bandits. Our primary motivation is to rigorously examine the exploration-exploitation trade-off in the context of sampling. We systematically define plausible notions of regret for this framework and establish corresponding lower bounds. We then propose a simple algorithm that achieves these optimal regret bounds. Our theoretical results demonstrate that in contrast to optimization, sampling does not require exploration. To further connect our findings with those of multi-armed bandits, we define a continuous family of problems and associated regret measures that smoothly interpolates and unifies multi-armed sampling and multi-armed bandit problems using a temperature parameter. We believe the multi-armed sampling framework, and our findings in this setting, can have a foundational role in the study of sampling, including recent neural samplers, akin to the role of multi-armed bandits in reinforcement learning. In particular, our work sheds light on the need for exploration and the convergence properties of algorithms for entropy-regularized reinforcement learning, fine-tuning of pretrained models, and reinforcement learning with human feedback (RLHF).

2025-06-30

arXiv (published)

doi.org

arxiv.org

Multiscale Neural PDE Surrogates for Prediction and Downscaling: Application to Ocean Currents

Abdessamad El-Kabid

Loubna Benabbou

Redouane Lguensat

Alex Hernández-García

Accurate modeling of physical systems governed by partial differential equations is a central challenge in scientific computing. In oceanography, high-resolution current data are critical for coastal management, environmental monitoring, and maritime safety. However, available satellite products, such as Copernicus data for sea water velocity at ~0.08 degrees spatial resolution and global ocean models, often lack the spatial granularity required for detailed local analyses. In this work, we (a) introduce a supervised deep learning framework based on neural operators for solving PDEs and providing arbitrary resolution solutions, and (b) propose downscaling models with an application to Copernicus ocean current data. Additionally, our method can model surrogate PDEs and predict solutions at arbitrary resolution, regardless of the input resolution. We evaluated our model on real-world Copernicus ocean current data and synthetic Navier-Stokes simulation datasets.

2025-06-30

arXiv (published)

doi.org

arxiv.org

Optimizers Qualitatively Alter Solutions And We Should Leverage This

Razvan Pascanu

Clare Lyle

Ionut-Vlad Modoranu

Naima Elosegui Borras

Dan Alistarh

Petar Veličković

A. Chandar

Soham De

James Martens

Due to the nonlinear nature of Deep Neural Networks (DNNs), one cannot guarantee convergence to a unique global minimum of the loss when using optimizers relying only on local information, such as SGD. Indeed, this was a primary source of skepticism regarding the feasibility of DNNs in the early days of the field. The past decades of progress in deep learning have revealed this skepticism to be misplaced, and a large body of empirical evidence shows that sufficiently large DNNs following standard training protocols exhibit well-behaved optimization dynamics that converge to performant solutions. This success has biased the community to use convex optimization as a mental model for learning, leading to a focus on training efficiency, either in terms of required iterations, FLOPs or wall-clock time, when improving optimizers. We argue that, while this perspective has proven extremely fruitful, another perspective specific to DNNs has received considerably less attention: the optimizer not only influences the rate of convergence, but also the qualitative properties of the learned solutions. Restated, the optimizer can and will encode inductive biases and change the effective expressivity of a given class of models. Furthermore, we believe the optimizer can be an effective way of encoding desiderata in the learning process. We contend that the community should aim at understanding the biases of already existing methods, as well as aim to build new optimizers with the explicit intent of inducing certain properties of the solution, rather than solely judging them based on their convergence rates. We hope our arguments will inspire research to improve our understanding of how the learning process can impact the type of solution we converge to, and lead to a greater recognition of optimizer design as a critical lever that complements the roles of architecture and data in shaping model outcomes.

2025-06-30

arXiv (published)

doi.org

arxiv.org
