Publications

How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models

Dharshan Kumaran

Stephen M Fleming

Larisa Markeeva

Joseph Heyward

Andrea Banino

Mrinal Mathur

Razvan Pascanu

Simon Kayode Osindero

Benedetto De Martino

Petar Veličković

Viorica Patraucean

Large language models (LLMs) exhibit strikingly conflicting behaviors: they can appear steadfastly overconfident in their initial answers whilst at the same time being prone to excessive doubt when challenged. To investigate this apparent paradox, we developed a novel experimental paradigm, exploiting the unique ability to obtain confidence estimates from LLMs without creating memory of their initial judgments -- something impossible in human participants. We show that LLMs -- Gemma 3, GPT4o and o1-preview -- exhibit a pronounced choice-supportive bias that reinforces and boosts their estimate of confidence in their answer, resulting in a marked resistance to change their mind. We further demonstrate that LLMs markedly overweight inconsistent compared to consistent advice, in a fashion that deviates qualitatively from normative Bayesian updating. Finally, we demonstrate that these two mechanisms -- a drive to maintain consistency with prior commitments and hypersensitivity to contradictory feedback -- parsimoniously capture LLM behavior in a different domain. Together, these findings furnish a mechanistic account of LLM confidence that explains both their stubbornness and excessive sensitivity to criticism.

2025-07-03

arXiv (preprint)

doi.org

arxiv.org

A Novel Sequential Framework for Transmission Network Expansion Planning: Benders Decomposition Preceding Semidefinite Programming

Elmira Fathipasandideh

Hussein Suprême

Hanane Dagdougui

Dalal Asber

The transmission network expansion planning (TNEP) problem is inherently complex because of its nonlinear and nonconvex nature, arising from the inclusion of AC power flow constraints, discrete investment decisions, and multiple operating scenarios. These characteristics make the problem computationally challenging, particularly when scaling to larger systems with multistage planning horizons. Addressing this complexity requires advanced methodologies that balance solution accuracy and computational efficiency. This paper presents a novel two-step framework for TNEP that first applies Benders decomposition to separate investment and operational decisions, followed by semidefinite linearization to reformulate the operational subproblems. The proposed approach enhances the solution quality by ensuring convexity in the subproblems and improves computational efficiency through decomposition. Numerical results for 6-, 10-, and 24-bus test systems demonstrate that the proposed method achieves superior performance compared to existing approaches in terms of solution accuracy and computational efficiency.

2025-07-03

2025 IEEE Kiel PowerTech (published)

doi.org

Toward whole-genome inference of polygenic scores with fast and memory-efficient algorithms.

Shadi Zabad

Chirayu Anant Haryan

Simon Gravel

Sanchit Misra

Yue Li

2025-07-03

American Journal of Human Genetics (published)

doi.org

AfroBench: How Good are Large Language Models on African Languages?

Jessica Ojo

Kelechi Ogueji

Pontus Stenetorp

David Ifeoluwa Adelani

2025-07-01

Findings of the Association for Computational Linguistics: ACL 2025 (published)

doi.org

arxiv.org

Aligner l’intelligence artificielle avec les objectifs de développement durable (ODD) des Nations unies

Marie Zumstein

Catherine Régis

Karine Gentelet

2025-07-01

(published)

doi.org

An Artificial Intelligence-Based Model to Predict Pregnancy After Intrauterine Insemination: A Retrospective Analysis of 9501 Cycles

Jaume Minano Masip

Camille Grysole

Penelope Borduas

Isaac-Jacques Kadoch

Simon Phillips

Doina Precup

Daniel Dufort

Background/Objectives: Intrauterine insemination (IUI) is a common first-line approach in the treatment of numerous infertile couples, especially in cases of unexplained infertility. Its relatively low success rate, however, could benefit from the development of AI-based support tools to predict its outcome, thus helping the clinical management of patients undergoing IUI cycles. Our objective was to develop a robust and accurate machine learning model that predicts pregnancy outcomes following IUI. Methods: A retrospective, observational, and single-center study was conducted. In total, 3535 couples (aged 18–43 years) that underwent IUI between January 2011 and December 2015 were recruited. Twenty-one clinical and laboratory parameters of 9501 IUI cycles were used to train different machine learning algorithms. Accuracy of pregnancy outcome was evaluated by an area under the curve (AUC) analysis. Results: The linear SVM outperformed AdaBoost, Kernel SVM, Random Forest, Extreme Forest, Bagging, and Voting classifiers. Pre-wash sperm concentration, the ovarian stimulation protocol, cycle length, and maternal age were strong predictors of a positive pregnancy test following IUI (AUC = 0.78). Paternal age was found to be the worst predictor. Conclusions: Our Linear SVM model predicts a positive pregnancy outcome following IUI. Although this model shows value for the clinical management of infertile patients and informed decision-making by the patients, further validation using independent datasets is required prior to clinical implementation.

2025-07-01

Journal of Personalized Medicine (published)

doi.org

AURA: A Multi-Modal Medical Agent for Understanding, Reasoning & Annotation

Nima Fathi

Amar Kumar

Tal Arbel

Recent advancements in Large Language Models (LLMs) have catalyzed a paradigm shift from static prediction systems to agentic AI agents capable of reasoning, interacting with tools, and adapting to complex tasks. While LLM-based agentic systems have shown promise across many domains, their application to medical imaging remains in its infancy. In this work, we introduce AURA, the first visual linguistic explainability agent designed specifically for comprehensive analysis, explanation, and evaluation of medical images. By enabling dynamic interactions, contextual explanations, and hypothesis testing, AURA represents a significant advancement toward more transparent, adaptable, and clinically aligned AI systems. We highlight the promise of agentic AI in transforming medical image analysis from static predictions to interactive decision support. Leveraging Qwen-32B, an LLM-based architecture, AURA integrates a modular toolbox comprising: (i) a segmentation suite with phase grounding, pathology segmentation, and anatomy segmentation to localize clinically meaningful regions; (ii) a counterfactual image-generation module that supports reasoning through image-level explanations; and (iii) a set of evaluation tools including pixel-wise difference-map analysis, classification, and advanced state-of-the-art components to assess diagnostic relevance and visual interpretability.

2025-07-01

arXiv (published)

doi.org

Autoregressive Speech Enhancement via Acoustic Tokens

Luca Della Libera

Cem Subakan

Mirco Ravanelli

2025-07-01

arXiv (published)

doi.org

arxiv.org

CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design

Prashant Govindarajan

Davide Baldelli

Jay Pathak

Quentin Fournier

Sarath Chandar

Computer-aided design (CAD) is the digital construction of 2D and 3D objects, and is central to a wide range of engineering and manufacturing applications like automobile and aviation. Despite its importance, CAD modeling remains largely a time-intensive, manual task. Recent works have attempted to automate this process with small transformer-based models and handcrafted CAD sequence representations. However, there has been little effort to leverage the potential of large language models (LLMs) for sequential CAD design. In this work, we introduce a new large-scale dataset of more than 170k CAD models annotated with high-quality, human-like descriptions generated with our pipeline based on GPT-4.1. Using this dataset, we fine-tune powerful code-LLMs to generate CAD sequences represented in a JSON-based format from natural language descriptions, demonstrating the viability and effectiveness of this approach for text-conditioned CAD generation. Because simple metrics often fail to reflect the quality of generated objects, we introduce geometric and topological metrics based on sphericity, mean curvature, and Euler characteristic to provide richer structural insights. Our experiments and ablation studies on both synthetic and human-annotated data demonstrate that CADmium is able to automate CAD design, drastically speeding up the design of new objects. The dataset, code, and fine-tuned models are available online.

2025-07-01

arXiv (published)

doi.org

arxiv.org

Capacity-Constrained Continual Learning

Zheng Wen

Doina Precup

Benjamin Van Roy

Satinder Singh

2025-07-01

arXiv (published)

doi.org