Publications

How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models

Dharshan Kumaran

Stephen M Fleming

Larisa Markeeva

Joseph Heyward

Andrea Banino

Mrinal Mathur

Razvan Pascanu

Simon Kayode Osindero

Benedetto De Martino

Petar Veličković

Viorica Patraucean

Large language models (LLMs) exhibit strikingly conflicting behaviors: they can appear steadfastly overconfident in their initial answers whilst at the same time being prone to excessive doubt when challenged. To investigate this apparent paradox, we developed a novel experimental paradigm, exploiting the unique ability to obtain confidence estimates from LLMs without creating memory of their initial judgments -- something impossible in human participants. We show that LLMs -- Gemma 3, GPT4o and o1-preview -- exhibit a pronounced choice-supportive bias that reinforces and boosts their estimate of confidence in their answer, resulting in a marked resistance to change their mind. We further demonstrate that LLMs markedly overweight inconsistent compared to consistent advice, in a fashion that deviates qualitatively from normative Bayesian updating. Finally, we demonstrate that these two mechanisms -- a drive to maintain consistency with prior commitments and hypersensitivity to contradictory feedback -- parsimoniously capture LLM behavior in a different domain. Together, these findings furnish a mechanistic account of LLM confidence that explains both their stubbornness and excessive sensitivity to criticism.

2025-07-02

ArXiv (preprint)

doi.org

arxiv.org

A Novel Sequential Framework for Transmission Network Expansion Planning: Benders Decomposition Preceding Semidefinite Programming

Elmira Fathipasandideh

Hussein Suprême

Hanane Dagdougui

Dalal Asber

The transmission network expansion planning (TNEP) problem is inherently complex because of its nonlinear and nonconvex nature, arising from the inclusion of AC power flow constraints, discrete investment decisions, and multiple operating scenarios. These characteristics make the problem computationally challenging, particularly when scaling to larger systems with multistage planning horizons. Addressing this complexity requires advanced methodologies that balance solution accuracy and computational efficiency. This paper presents a novel two-step framework for TNEP that first applies Benders decomposition to separate investment and operational decisions, followed by semidefinite linearization to reformulate the operational subproblems. The proposed approach enhances the solution quality by ensuring convexity in the subproblems and improves computational efficiency through decomposition. Numerical results for 6-, 10-, and 24-bus test systems demonstrate that the proposed method achieves superior performance compared to existing approaches in terms of solution accuracy and computational efficiency.

2025-07-02

2025 IEEE Kiel PowerTech (published)

doi.org

Toward whole-genome inference of polygenic scores with fast and memory-efficient algorithms.

Shadi Zabad

Chirayu Anant Haryan

Simon Gravel

Sanchit Misra

Yuemei Li

2025-07-02

American Journal of Human Genetics (published)

doi.org

AfroBench: How Good are Large Language Models on African Languages?

Jessica Ojo

Kelechi Ogueji

Pontus Stenetorp

David Ifeoluwa Adelani

2025-06-30

Findings of the Association for Computational Linguistics: ACL 2025 (published)

doi.org

arxiv.org

Aligner l’intelligence artificielle avec les objectifs de développement durable (ODD) des Nations unies

Marie Zumstein

Catherine Régis

Karine Gentelet

2025-06-30

(published)

doi.org

An Artificial Intelligence-Based Model to Predict Pregnancy After Intrauterine Insemination: A Retrospective Analysis of 9501 Cycles

Jaume Minano Masip

Camille Grysole

Penelope Borduas

Isaac-Jacques Kadoch

Simon Phillips

Doina Precup

Daniel Dufort

Background/Objectives: Intrauterine insemination (IUI) is a common first-line approach in the treatment of numerous infertile couples, especially in cases of unexplained infertility. Its relatively low success rate, however, could benefit from the development of AI-based support tools to predict its outcome, thus helping the clinical management of patients undergoing IUI cycles. Our objective was to develop a robust and accurate machine learning model that predicts pregnancy outcomes following IUI. Methods: A retrospective, observational, and single-center study was conducted. In total, 3535 couples (aged 18–43 years) that underwent IUI between January 2011 and December 2015 were recruited. Twenty-one clinical and laboratory parameters of 9501 IUI cycles were used to train different machine learning algorithms. Accuracy of pregnancy outcome was evaluated by an area under the curve (AUC) analysis. Results: The linear SVM outperformed AdaBoost, Kernel SVM, Random Forest, Extreme Forest, Bagging, and Voting classifiers. Pre-wash sperm concentration, the ovarian stimulation protocol, cycle length, and maternal age were strong predictors of a positive pregnancy test following IUI (AUC = 0.78). Paternal age was found to be the worst predictor. Conclusions: Our Linear SVM model predicts a positive pregnancy outcome following IUI. Although this model shows value for the clinical management of infertile patients and informed decision-making by the patients, further validation using independent datasets is required prior to clinical implementation.

2025-06-30

Journal of Personalized Medicine (published)

doi.org

Combining Domain and Alignment Vectors Provides Better Knowledge-Safety Trade-offs in LLMs

Megh Thakkar

Quentin Fournier

Matthew D Riemer

Pin-Yu Chen

Amal Zouaq

Payel Das

A. Chandar

2025-06-30

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (published)

doi.org

ConvNTC: convolutional neural tensor completion for detecting “A–A–B” type biological triplets

Pei Liu

Xiao Liang

Yuemei Li

Jiawei Luo

Systematically investigating interactions among molecules of the same type across different contexts is crucial for unraveling disease mechanisms and developing potential therapeutic strategies. The “A–A–B” triplet paradigm provides a principled approach to model such context-specific interactions, and leveraging a third-order tensor to capture such ternary relationships is an efficient strategy. However, effectively modeling both multilinear and nonlinear characteristics to accurately identify such triplets using tensor-based methods remains a challenge. In this paper, we propose a novel Convolutional Neural Tensor Completion (ConvNTC) framework that collaboratively learns the multilinear and nonlinear representations to model triplet-based network interactions. ConvNTC consists of a multilinear module and a nonlinear module. The former is a tensor decomposition approach that integrates multiple constraints to learn the tensor factor embeddings. The latter contains three components: an embedding generator to produce position-specific index embeddings for each tensor entry in addition to the factor embeddings, a convolutional encoder to perform nonlinear feature mapping while preserving the tensor’s rank-one property, and a Kolmogorov–Arnold Network (KAN) based predictor to effectively capture high-dimensional relationships aligned with the intrinsic structure of real-world data. We evaluate ConvNTC on two types of “A–A–B” triplet datasets: miRNA–miRNA–disease and drug–drug–cell. Comprehensive experiments against 11 state-of-the-art methods demonstrate the superiority of ConvNTC in terms of triplet prediction. ConvNTC reveals promising prognostic values of the miRNA–miRNA interactions on breast cancer and detects synergistic drug combinations in cancer cell lines.

2025-06-30

Briefings Bioinform. (published)

doi.org

Curiosity-Driven Exploration via Temporal Contrastive Learning

Faisal Mohamed

Catherine Ji

Benjamin Eysenbach

Glen Berseth

Effective exploration in reinforcement learning requires keeping track not just of where the agent has been, but also of how the agent thinks about and represents the world: an agent should explore states that enable it to learn powerful representations. Temporal representations can include the information required to solve any potential task while avoiding the computational cost of reconstruction. In this paper, we propose an exploration method that uses temporal contrastive representations to drive exploration, maximizing coverage as seen through the lens of these temporal representations. We demonstrate complex exploration behaviors in locomotion, manipulation, and embodied-AI tasks, revealing previously unknown capabilities and behaviors once achievable only via extrinsic rewards.

2025-06-30

rl-conference.cc/RLC/2025/Workshop/RLBrew (published)

openreview.net

Is Exploration or Optimization the Problem for Deep Reinforcement Learning?

Glen Berseth

2025-06-30

rl-conference.cc/RLC/2025/Workshop/Finding_the_Frame (published)

openreview.net

Filter Equivariant Functions: A symmetric account of length-general extrapolation on lists

Owen Lewis

Neil Ghani

Andrew Joseph Dudzik

Christos Perivolaropoulos

Razvan Pascanu

Petar Veličković

2025-06-30

arXiv (published)

doi.org

A Geometric Lens on RL Environment Complexity Based on Ricci Curvature

Ali Saheb Pasand

Pablo Samuel Castro

Pouya Bashivan

We introduce Ollivier-Ricci Curvature (ORC) as an information-geometric tool for analyzing the local structure of reinforcement learning (RL) environments. We establish a novel connection between ORC and the Successor Representation (SR), enabling a geometric interpretation of environment dynamics decoupled from reward signals. Our analysis shows that states with positive and negative ORC values correspond to regions where random walks converge and diverge respectively, which are often critical for effective exploration. ORC is highly correlated with established environment complexity metrics, yet integrates naturally with standard RL frameworks based on SR and provides both global and local complexity measures. Leveraging this property, we propose an ORC-based intrinsic reward that guides agents toward divergent regions and away from convergent traps. Empirical results demonstrate that our curvature-driven reward substantially improves exploration performance across diverse environments, outperforming both random and count-based intrinsic baselines.

2025-06-30

rl-conference.cc/RLC/2025/Workshop/Finding_the_Frame (published)

openreview.net
