The Mila AI Policy Fellowship translates deep AI expertise into rigorous, public-interest policy. Read the newest publication Bridging the Expertise Gap: Knowledge Transfer Mechanisms for AI Regulation by Moritz von Knebel
This program supports AI startups at any time of the year. Benefit from cutting-edge resources and tailored support to accelerate your technology's development.
We use cookies to analyze the browsing and usage of our website and to personalize your experience. You can disable these technologies at any time, but this may limit certain functionalities of the site. Read our Privacy Policy for more information.
Setting cookies
You can enable and disable the types of cookies you wish to accept. However certain choices you make could affect the services offered on our sites (e.g. suggestions, personalised ads, etc.).
Essential cookies
These cookies are necessary for the operation of the site and cannot be deactivated. (Still active)
Analytics cookies
Do you accept the use of cookies to measure the audience of our sites?
Multimedia Player
Do you accept the use of cookies to display and allow you to watch the video content hosted by our partners (YouTube, etc.)?
Publications
How Overconfidence in Initial Choices and Underconfidence Under Criticism Modulate Change of Mind in Large Language Models
Large language models (LLMs) exhibit strikingly conflicting behaviors: they can appear steadfastly overconfident in their initial answers wh… (see more)ilst at the same time being prone to excessive doubt when challenged. To investigate this apparent paradox, we developed a novel experimental paradigm, exploiting the unique ability to obtain confidence estimates from LLMs without creating memory of their initial judgments -- something impossible in human participants. We show that LLMs -- Gemma 3, GPT4o and o1-preview -- exhibit a pronounced choice-supportive bias that reinforces and boosts their estimate of confidence in their answer, resulting in a marked resistance to change their mind. We further demonstrate that LLMs markedly overweight inconsistent compared to consistent advice, in a fashion that deviates qualitatively from normative Bayesian updating. Finally, we demonstrate that these two mechanisms -- a drive to maintain consistency with prior commitments and hypersensitivity to contradictory feedback -- parsimoniously capture LLM behavior in a different domain. Together, these findings furnish a mechanistic account of LLM confidence that explains both their stubbornness and excessive sensitivity to criticism.
The transmission network expansion planning (TNEP) problem is inherently complex because of its nonlinear and nonconvex nature, arising from… (see more) the inclusion of AC power flow constraints, discrete investment decisions, and multiple operating scenarios. These characteristics make the problem computationally challenging, particulary when scaling to larger systems with multistage planning horizons. Addressing this complexity requires advanced methodologies that balance the solution accuracy and computational efficiency. This paper presents a novel two-step framework for TNEP that first applies Benders decomposition to separate investment and operational decisions, followed by semidefinite linearization to reformulate the operational subproblems. The proposed approach enhances the solution quality by ensuring convexity in the subproblems and improves computational efficiency through decomposition. Numerical results for 6- , 10-, and 24-bus test systems demonstrate that the proposed method achieves superior performance compared to existing approaches in terms of solution accuracy and computational efficiency.
Background/Objectives: Intrauterine insemination (IUI) is a common first-line approach in the treatment of numerous infertile couples, espec… (see more)ially in cases of unexplained infertility. Its relatively low success rate, however, could benefit from the development of AI-based support tools to predict its outcome, thus helping the clinical management of patients undergoing IUI cycles. Our objective was to develop a robust and accurate machine learning model that predicts pregnancy outcomes following IUI. Methods: A retrospective, observational, and single-center study was conducted. In total, 3535 couples (aged 18–43 years) that underwent IUI between January 2011 and December 2015 were recruited. Twenty-one clinical and laboratory parameters of 9501 IUI cycles were used to train different machine learning algorithms. Accuracy of pregnancy outcome was evaluated by an area under the curve (AUC) analysis. Results: The linear SVM outperformed AdaBoost, Kernel SVM, Random Forest, Extreme Forest, Bagging, and Voting classifiers. Pre-wash sperm concentration, the ovarian stimulation protocol, cycle length, and maternal age were strong predictors of a positive pregnancy test following IUI (AUC = 0.78). Paternal age was found to be the worst predictor. Conclusions: Our Linear SVM model predicts a positive pregnancy outcome following IUI. Although this model shows value for the clinical management of infertile patients and informed decision-making by the patients, further validation using independent datasets is required prior to clinical implementation.
ConvNTC: convolutional neural tensor completion for detecting “A–A–B” type biological triplets
Pei Liu
Xiao Liang
Yuemei Li
Jiawei Luo
Abstract Systematically investigating interactions among molecules of the same type across different contexts is crucial for unraveling dise… (see more)ase mechanisms and developing potential therapeutic strategies. The “A–A–B” triplet paradigm provides a principled approach to model such context-specific interactions, and leveraging third-order tensor to capture such type ternary relationships is an efficient strategy. However, effectively modeling both multilinear and nonlinear characteristics to accurately identify such triplets using tensor-based methods remains a challenge. In this paper, we propose a novel Convolutional Neural Tensor Completion (ConvNTC) framework that collaboratively learns the multilinear and nonlinear representations to model triplet-based network interactions. ConvNTC consists of a multilinear module and a nonlinear module. The former is a tensor decomposition approach that integrates multiple constraints to learn the tensor factor embeddings. The latter contains three components: an embedding generator to produce position-specific index embeddings for each tensor entry in addition to the factor embeddings, a convolutional encoder to perform nonlinear feature mapping while preserving the tensor’s rank-one property, and a Kolmogorov–Arnold Network (KAN) based predictor to effectively capture high-dimensional relationships aligned with the intrinsic structure of real-world data. We evaluate ConvNTC on two types triplet datasets of the “A–A–B” type: miRNA–miRNA–disease and drug–drug–cell. Comprehensive experiments against 11 state-of-the-art methods demonstrate the superiority of ConvNTC in terms of triplet prediction. ConvNTC reveals promising prognostic values of the miRNA–miRNA interactions on breast cancer and detects synergistic drug combinations in cancer cell lines.
Effective exploration in reinforcement learning requires keeping track not just of where the agent has been, but also of how the agent think… (see more)s about and represents the world: an agent should explore states that enable it to learn powerful representations. Temporal representations can include the information required to solve any potential task while avoiding the computational cost of reconstruction. In this paper, we propose an exploration method that uses temporal contrastive representations to drive exploration, maximizing coverage as seen through the lens of these temporal representations. We demonstrate complex exploration behaviors in locomotion, manipulation, and embodied-AI tasks, revealing previously unknown capabilities and behaviors once achievable only via extrinsic rewards.
We introduce Ollivier-Ricci Curvature (ORC) as an information-geometric tool for analyzing the local structure of reinforcement learning (RL… (see more)) environments. We establish a novel connection between ORC and the Successor Representation (SR), enabling a geometric interpretation of environment dynamics decoupled from reward signals. Our analysis shows that states with positive and negative ORC values correspond to regions where random walks converge and diverge respectively, which are often critical for effective exploration. ORC is highly correlated with established environment complexity metrics, yet integrates naturally with standard RL frameworks based on SR and provides both global and local complexity measures. Leveraging this property, we propose an ORC-based intrinsic reward that guides agents toward divergent regions and away from convergent traps. Empirical results demonstrate that our curvature-driven reward substantially improves exploration performance across diverse environments, outperforming both random and count-based intrinsic baselines.