Publications

An Artificial Intelligence-Based Model to Predict Pregnancy After Intrauterine Insemination: A Retrospective Analysis of 9501 Cycles
Camille Grysole
Penelope Borduas
Isaac-Jacques Kadoch
Simon Phillips
Daniel Dufort
Background/Objectives: Intrauterine insemination (IUI) is a common first-line approach in the treatment of numerous infertile couples, especially in cases of unexplained infertility. Its relatively low success rate, however, could benefit from the development of AI-based support tools to predict its outcome, thus helping the clinical management of patients undergoing IUI cycles. Our objective was to develop a robust and accurate machine learning model that predicts pregnancy outcomes following IUI. Methods: A retrospective, observational, single-center study was conducted. In total, 3535 couples (aged 18–43 years) who underwent IUI between January 2011 and December 2015 were included. Twenty-one clinical and laboratory parameters from 9501 IUI cycles were used to train different machine learning algorithms. Prediction accuracy was evaluated by area-under-the-curve (AUC) analysis. Results: The linear SVM outperformed AdaBoost, kernel SVM, random forest, extreme forest, bagging, and voting classifiers. Pre-wash sperm concentration, the ovarian stimulation protocol, cycle length, and maternal age were strong predictors of a positive pregnancy test following IUI (AUC = 0.78); paternal age was the weakest predictor. Conclusions: Our linear SVM model predicts a positive pregnancy outcome following IUI. Although this model shows value for the clinical management of infertile patients and for informed decision-making by patients, further validation on independent datasets is required prior to clinical implementation.
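The study's headline number is an AUC of 0.78, so it is worth recalling what that metric computes: the probability that a randomly chosen positive cycle is scored above a randomly chosen negative one. A minimal sketch via the rank-sum (Mann–Whitney) formulation; the function name and toy data are illustrative, not taken from the study:

```python
def auc_score(labels, scores):
    """AUC via the Mann-Whitney formulation: the fraction of positive/negative
    pairs in which the positive example receives the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: two pregnancy-positive and two pregnancy-negative cycles,
# with hypothetical classifier scores.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2]
print(auc_score(labels, scores))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported 0.78 in context.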
AURA: A Multi-Modal Medical Agent for Understanding, Reasoning & Annotation
Recent advancements in Large Language Models (LLMs) have catalyzed a paradigm shift from static prediction systems to agentic AI systems capable of reasoning, interacting with tools, and adapting to complex tasks. While LLM-based agentic systems have shown promise across many domains, their application to medical imaging remains in its infancy. In this work, we introduce AURA, the first visual-linguistic explainability agent designed specifically for comprehensive analysis, explanation, and evaluation of medical images. By enabling dynamic interactions, contextual explanations, and hypothesis testing, AURA represents a significant advancement toward more transparent, adaptable, and clinically aligned AI systems. We highlight the promise of agentic AI in transforming medical image analysis from static prediction to interactive decision support. Leveraging Qwen-32B, an LLM-based architecture, AURA integrates a modular toolbox comprising: (i) a segmentation suite with phrase grounding, pathology segmentation, and anatomy segmentation to localize clinically meaningful regions; (ii) a counterfactual image-generation module that supports reasoning through image-level explanations; and (iii) a set of evaluation tools including pixel-wise difference-map analysis, classification, and advanced state-of-the-art components to assess diagnostic relevance and visual interpretability.
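The "modular toolbox" pattern behind agents like this can be sketched as a registry whose tool descriptions are exposed to the planner LLM, which then dispatches calls by name. All class and tool names below are hypothetical, not AURA's API:

```python
class ToolRegistry:
    """Hypothetical sketch of an agentic tool registry: the agent's planner
    reads the descriptions, picks a tool, and dispatches the call by name."""

    def __init__(self):
        self._tools = {}

    def register(self, name, fn, description):
        self._tools[name] = {"fn": fn, "description": description}

    def describe(self):
        # the text a planner LLM would see when choosing a tool
        return {name: t["description"] for name, t in self._tools.items()}

    def call(self, name, **kwargs):
        return self._tools[name]["fn"](**kwargs)

registry = ToolRegistry()
registry.register("segment_pathology",
                  lambda image: {"mask": f"mask_for_{image}"},
                  "Segment pathological regions in a medical image")
result = registry.call("segment_pathology", image="xray_001")
print(result)  # {'mask': 'mask_for_xray_001'}
```

The design point is that adding a new capability (a counterfactual generator, an evaluation tool) only means registering another entry; the dispatch loop never changes.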
Autoregressive Speech Enhancement via Acoustic Tokens
Yusuf Cem Subakan
Mirco Ravanelli
CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design
Computer-aided design (CAD) is the digital construction of 2D and 3D objects and is central to a wide range of engineering and manufacturing applications, such as the automotive and aviation industries. Despite its importance, CAD modeling remains a largely time-intensive, manual task. Recent works have attempted to automate this process with small transformer-based models and handcrafted CAD sequence representations. However, there has been little effort to leverage the potential of large language models (LLMs) for sequential CAD design. In this work, we introduce a new large-scale dataset of more than 170k CAD models annotated with high-quality, human-like descriptions generated by our GPT-4.1-based pipeline. Using this dataset, we fine-tune powerful code LLMs to generate CAD sequences represented in a JSON-based format from natural-language descriptions, demonstrating the viability and effectiveness of this approach for text-conditioned CAD generation. Because simple metrics often fail to reflect the quality of generated objects, we introduce geometric and topological metrics based on sphericity, mean curvature, and the Euler characteristic to provide richer structural insights. Our experiments and ablation studies on both synthetic and human-annotated data demonstrate that CADmium can automate CAD design, drastically speeding up the design of new objects. The dataset, code, and fine-tuned models are available online.
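Of the three proposed metrics, sphericity has a particularly simple closed form, pi^(1/3) * (6V)^(2/3) / A, which equals 1 for a perfect sphere and decreases for less compact shapes. A quick sketch (the cube's value is a standard fact, not a result from the paper):

```python
import math

def sphericity(volume, surface_area):
    # Wadell sphericity: pi^(1/3) * (6V)^(2/3) / A; equals 1 for a sphere
    return math.pi ** (1 / 3) * (6 * volume) ** (2 / 3) / surface_area

r = 2.0
sphere = sphericity(4 / 3 * math.pi * r ** 3, 4 * math.pi * r ** 2)
cube = sphericity(1.0, 6.0)  # unit cube: V = 1, A = 6
print(round(sphere, 6), round(cube, 3))  # 1.0 0.806
```

A generated object whose sphericity deviates sharply from its text description (e.g. a "ball bearing" scoring 0.5) signals a structural failure that token-level metrics would miss.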
Capacity-Constrained Continual Learning
Zheng Wen
Benjamin Van Roy
Satinder Singh
ConvNTC: convolutional neural tensor completion for detecting “A–A–B” type biological triplets
Pei Liu
Xiao Liang
Yuemei Li
Jiawei Luo
Systematically investigating interactions among molecules of the same type across different contexts is crucial for unraveling disease mechanisms and developing potential therapeutic strategies. The “A–A–B” triplet paradigm provides a principled approach to modeling such context-specific interactions, and leveraging a third-order tensor to capture these ternary relationships is an efficient strategy. However, effectively modeling both the multilinear and nonlinear characteristics needed to accurately identify such triplets with tensor-based methods remains a challenge. In this paper, we propose a novel Convolutional Neural Tensor Completion (ConvNTC) framework that collaboratively learns multilinear and nonlinear representations to model triplet-based network interactions. ConvNTC consists of a multilinear module and a nonlinear module. The former is a tensor-decomposition approach that integrates multiple constraints to learn the tensor factor embeddings. The latter contains three components: an embedding generator that produces position-specific index embeddings for each tensor entry in addition to the factor embeddings, a convolutional encoder that performs nonlinear feature mapping while preserving the tensor’s rank-one property, and a Kolmogorov–Arnold Network (KAN)-based predictor that effectively captures high-dimensional relationships aligned with the intrinsic structure of real-world data. We evaluate ConvNTC on two “A–A–B”-type triplet datasets: miRNA–miRNA–disease and drug–drug–cell. Comprehensive experiments against 11 state-of-the-art methods demonstrate the superiority of ConvNTC for triplet prediction. ConvNTC reveals promising prognostic value of miRNA–miRNA interactions in breast cancer and detects synergistic drug combinations in cancer cell lines.
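The multilinear module's core idea, reconstructing each tensor entry from factor embeddings, can be sketched independently of the paper's code. Below, a hypothetical CP-style reconstruction of one entry of an “A–A–B” tensor from three factor matrices; all names and values are illustrative:

```python
def cp_entry(A, B, C, i, j, k):
    """Entry T[i][j][k] of a CP-factorized third-order tensor: a sum over
    rank components of products of the three factor embeddings, so each
    component contributes a rank-one tensor."""
    rank = len(A[0])
    return sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))

# Rank-2 toy factors; rows play the role of (e.g.) miRNA, miRNA, and
# disease embeddings (hypothetical numbers).
A = [[1.0, 0.5]]
B = [[2.0, 1.0]]
C = [[0.5, 2.0]]
print(cp_entry(A, B, C, 0, 0, 0))  # 1*2*0.5 + 0.5*1*2 = 2.0
```

ConvNTC's nonlinear module then replaces this fixed multilinear score with a learned mapping while, per the abstract, still preserving the rank-one structure of each component.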
Curiosity-Driven Exploration via Temporal Contrastive Learning
Catherine Ji
Benjamin Eysenbach
Effective exploration in reinforcement learning requires keeping track not just of where the agent has been, but also of how the agent thinks about and represents the world: an agent should explore states that enable it to learn powerful representations. Temporal representations can include the information required to solve any potential task while avoiding the computational cost of reconstruction. In this paper, we propose an exploration method that uses temporal contrastive representations to drive exploration, maximizing coverage as seen through the lens of these representations. We demonstrate complex exploration behaviors in locomotion, manipulation, and embodied-AI tasks, revealing capabilities and behaviors previously achievable only via extrinsic rewards.
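One way to read "maximizing coverage through the lens of the representation" is as a novelty bonus computed in embedding space rather than raw state space: states whose embeddings are far from everything visited earn a larger intrinsic reward. A hedged sketch, with a nearest-neighbor rule that is illustrative rather than the paper's exact objective:

```python
import math

def novelty_bonus(z, visited):
    """Intrinsic reward for a state with embedding z: distance to the nearest
    previously visited embedding. Unvisited representation space is infinitely
    novel by convention."""
    if not visited:
        return float("inf")
    return min(math.dist(z, v) for v in visited)

visited = [(0.0, 0.0), (1.0, 0.0)]
print(novelty_bonus((1.0, 1.0), visited))  # 1.0 (nearest is (1.0, 0.0))
```

Because distances are measured in the learned temporal representation, two states that look different pixel-wise but are dynamically equivalent earn little bonus, which is the key contrast with raw count-based exploration.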
Is Exploration or Optimization the Problem for Deep Reinforcement Learning?
Filter Equivariant Functions: A symmetric account of length-general extrapolation on lists
Owen Lewis
Neil Ghani
Andrew Joseph Dudzik
Christos Perivolaropoulos
From Black Box to Biomarker: Sparse Autoencoders for Interpreting Speech Models of Parkinson's Disease
Jen-Kai Chen
Roozbeh Sattari
Mirco Ravanelli
Denise Klein
Speech holds promise as a cost-effective and non-invasive biomarker for neurological conditions such as Parkinson's disease (PD). While deep learning systems trained on raw audio can find subtle signals not available from hand-crafted features, their black-box nature hinders clinical adoption. To address this, we apply sparse autoencoders (SAEs) to uncover interpretable internal representations from a speech-based PD detection system. We introduce a novel mask-based activation for adapting SAEs to small biomedical datasets, creating sparse, disentangled dictionary representations. These dictionary entries show strong associations with characteristic articulatory deficits in PD speech, such as reduced spectral flux and increased spectral flatness in the low-energy regions highlighted by the model's attention. We further show that spectral flux is related to volumetric measurements of the putamen from MRI scans, demonstrating the potential of SAEs to reveal clinically relevant biomarkers for disease monitoring and diagnosis.
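To make the SAE idea concrete: the encoder maps an activation vector to a sparse dictionary code, the decoder reconstructs the input, and an L1 penalty pushes most dictionary entries to zero so the surviving ones can be inspected as candidate biomarkers. A minimal pure-Python sketch; the weights and penalty are illustrative, and the paper's mask-based activation is not reproduced here:

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sae_forward(x, W_enc, W_dec, lam=0.01):
    """One SAE pass: sparse code h, reconstruction x_hat, and the training
    loss = squared reconstruction error + lam * L1 sparsity penalty on h."""
    h = relu(matvec(W_enc, x))
    x_hat = matvec(W_dec, h)
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return h, x_hat, recon + lam * sum(abs(v) for v in h)

# With identity weights the reconstruction is perfect, so only the
# sparsity term survives: loss = 0 + 0.1 * (1 + 2).
I = [[1.0, 0.0], [0.0, 1.0]]
h, x_hat, loss = sae_forward([1.0, 2.0], I, I, lam=0.1)
```

In practice the dictionary is heavily overcomplete, and it is the handful of entries that fire for PD speech, correlated here with spectral flux and flatness, that make the model inspectable.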
A Geometric Lens on RL Environment Complexity Based on Ricci Curvature
We introduce Ollivier-Ricci Curvature (ORC) as an information-geometric tool for analyzing the local structure of reinforcement learning (RL) environments. We establish a novel connection between ORC and the Successor Representation (SR), enabling a geometric interpretation of environment dynamics decoupled from reward signals. Our analysis shows that states with positive and negative ORC values correspond to regions where random walks converge and diverge, respectively, which are often critical for effective exploration. ORC is highly correlated with established environment-complexity metrics, yet it integrates naturally with standard SR-based RL frameworks and provides both global and local complexity measures. Leveraging this property, we propose an ORC-based intrinsic reward that guides agents toward divergent regions and away from convergent traps. Empirical results demonstrate that our curvature-driven reward substantially improves exploration performance across diverse environments, outperforming both random and count-based intrinsic baselines.
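For intuition: the ORC of an edge (x, y) is kappa = 1 - W1(mu_x, mu_y) / d(x, y), where mu_x is the one-step random-walk distribution at x and W1 is the 1-Wasserstein distance. On a path graph, W1 has a closed form via CDF differences, so the whole computation fits in a few lines. This toy is an illustration of the standard definition, not the paper's implementation:

```python
def w1_on_line(mu, nu):
    """1-Wasserstein distance between two distributions on nodes 0..n-1 of a
    path graph, whose metric is |i - j|: the sum of |CDF differences|."""
    total, cum = 0.0, 0.0
    for p, q in zip(mu, nu):
        cum += p - q          # running CDF difference
        total += abs(cum)     # final term is 0 for proper distributions
    return total

def orc_edge(mu_x, mu_y, dist=1.0):
    # Ollivier-Ricci curvature of edge (x, y): 1 - W1(mu_x, mu_y) / d(x, y)
    return 1.0 - w1_on_line(mu_x, mu_y) / dist

# Interior edge (1, 2) of the path 0-1-2-3 with a non-lazy random walk:
mu1 = [0.5, 0.0, 0.5, 0.0]   # from node 1 the walk jumps to 0 or 2
mu2 = [0.0, 0.5, 0.0, 0.5]   # from node 2 the walk jumps to 1 or 3
print(orc_edge(mu1, mu2))    # 0.0: a line is flat, matching intuition
```

States where neighboring walk distributions overlap heavily (W1 small, kappa > 0) are the "convergent" regions described above; bottlenecks and branching states push W1 up and kappa negative.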
Harnessing agent-based frameworks in CellAgentChat to unravel cell–cell interactions from single-cell and spatial transcriptomics
Understanding cell–cell interactions (CCIs) is essential yet challenging owing to the inherent intricacy and diversity of cellular dynamics. Existing approaches often analyze global patterns of CCIs using statistical frameworks, missing the nuances of individual cell behavior because of their focus on aggregate data. This makes them insensitive in complex environments where the detailed dynamics of cell interactions matter. We introduce CellAgentChat, an agent-based model (ABM) designed to decipher CCIs from single-cell RNA sequencing and spatial transcriptomics data. This approach models biological systems as collections of autonomous agents governed by biologically inspired principles and rules. Validated across eight diverse single-cell datasets, CellAgentChat demonstrates its effectiveness in detecting intricate signaling events across different cell populations. Moreover, CellAgentChat can generate animated visualizations of single-cell interactions and allows agent behavior rules to be modified, facilitating thorough exploration of both short- and long-range cellular communication. Furthermore, CellAgentChat leverages ABM features to enable intuitive in silico perturbations via agent-rule modifications, facilitating the development of novel intervention strategies. This ABM method unlocks an in-depth understanding of cellular signaling interactions across various biological contexts, thereby enhancing in silico studies for cellular communication–based therapies.
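The ABM framing means each cell is an autonomous agent carrying its own state, and interactions are scored pairwise by rules the user can rewrite. A minimal hypothetical sketch of one such rule, distance-weighted ligand-receptor signaling; the exponential decay and all names are illustrative, not CellAgentChat's actual model:

```python
import math

class CellAgent:
    """A cell as an autonomous agent: position plus expression state."""

    def __init__(self, cell_id, x, y, ligand=0.0, receptor=0.0):
        self.cell_id, self.x, self.y = cell_id, x, y
        self.ligand, self.receptor = ligand, receptor

def interaction_score(sender, receiver, decay=1.0):
    # one editable behavior rule: signal strength is the ligand-receptor
    # product, attenuated exponentially with spatial distance
    d = math.hypot(sender.x - receiver.x, sender.y - receiver.y)
    return sender.ligand * receiver.receptor * math.exp(-decay * d)

a = CellAgent("A", 0.0, 0.0, ligand=2.0)
b = CellAgent("B", 0.0, 0.0, receptor=3.0)
print(interaction_score(a, b))  # 6.0 at zero distance
```

An in silico perturbation is then just an edit to the rule or the agent state, e.g. setting `ligand = 0` to knock out a pathway and rescoring the population, which is the flexibility the abstract highlights.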