Publications

RAT: Bridging RNN Efficiency and Attention Accuracy in Language Modeling
Xiuying Wei
Anunay Yadav
Transformers have become the cornerstone of modern large-scale language models; however, their dependence on softmax attention poses a major computational bottleneck, particularly in long-context settings. In this work, rather than following prevalent approaches such as linear attention (or SSMs) and local attention, we introduce an intermediate design called RAT between recurrence and attention mechanisms. It partitions the input into chunks, applies a simple linear recurrence within each chunk to capture local dependencies, and then performs softmax attention across chunks to model long-range interactions. By adjusting the chunk size, RAT enables flexible trade-offs, combining the strengths of RNNs and attention. Empirically, with a chunk size of 16, the RAT layer achieves a 7× improvement in training speed with 100K-token sequences and 9× in generation at 4K sequence length, while maintaining similar or sometimes even better accuracy compared to standard attention. We demonstrate this by training 1.3B-parameter models from scratch and performing large-scale evaluations, including short- and long-context benchmarks, as well as supervised fine-tuning (SFT). We further propose a hybrid architecture that interleaves RAT with local attention. By combining efficient long-range modeling with strong local interactions, this hybrid design not only improves inference speed and reduces cache memory usage compared to attention, but also consistently enhances performance, for example achieving an average 1-point gain on commonsense reasoning tasks, up to 4 points on code tasks, and a 1-point ROUGE-L increase in a summarization SFT task. Code is available at https://github.com/CLAIRE-Labo/RAT.
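The two-level structure described above (linear recurrence within chunks, softmax attention across chunks) can be sketched as follows. This is a minimal NumPy illustration of the idea, not the paper's implementation: the fixed decay, the use of each chunk's last hidden state as its summary, and the plain dot-product attention are all simplifying assumptions.

```python
import numpy as np

def rat_layer(x, chunk_size):
    """Sketch of the RAT idea: intra-chunk linear recurrence,
    inter-chunk causal softmax attention over chunk summaries."""
    T, d = x.shape
    assert T % chunk_size == 0
    n_chunks = T // chunk_size
    chunks = x.reshape(n_chunks, chunk_size, d)

    # 1) Intra-chunk linear recurrence: h_t = a * h_{t-1} + x_t
    #    (fixed scalar decay `a` is an illustrative assumption).
    a = 0.9
    h = np.zeros_like(chunks)
    state = np.zeros((n_chunks, d))
    for t in range(chunk_size):
        state = a * state + chunks[:, t]
        h[:, t] = state

    # 2) Inter-chunk softmax attention over chunk summaries
    #    (last hidden state per chunk), causal across chunks.
    summaries = h[:, -1]                                  # (n_chunks, d)
    scores = summaries @ summaries.T / np.sqrt(d)
    mask = np.tril(np.ones((n_chunks, n_chunks), dtype=bool))
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    context = weights @ summaries                         # (n_chunks, d)

    # Broadcast each chunk's attended context back to its positions.
    out = h + context[:, None, :]
    return out.reshape(T, d)

y = rat_layer(np.random.randn(64, 8), chunk_size=16)
print(y.shape)  # (64, 8)
```

With chunk size c and sequence length T, attention here operates over only T/c summaries, which is where the stated speedups at long context come from.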
DOLPHIN advances single-cell transcriptomics beyond gene level by leveraging exon and junction reads
Kailu Song
Yumin Zheng
Bowen Zhao
David H. Eidelman
The advent of single-cell sequencing has revolutionized the study of cellular dynamics, providing unprecedented resolution into the molecular states and heterogeneity of individual cells. However, the rich potential of exon-level information and junction reads within single cells remains underutilized. Conventional gene-count methods overlook critical exon and junction data, limiting the quality of cell representation and downstream analyses such as subpopulation identification and alternative splicing detection. We introduce DOLPHIN, a deep learning method that integrates exon-level and junction read data, representing genes as graph structures. These graphs are processed by a variational graph autoencoder to improve cell embeddings. DOLPHIN not only demonstrates superior performance in cell clustering, biomarker discovery, and alternative splicing detection but also provides a distinct capability to detect subtle transcriptomic differences at the exon level that are often masked in gene-level analyses. By examining cellular dynamics with enhanced resolution, DOLPHIN provides new insights into disease mechanisms and potential therapeutic targets.
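The gene-as-graph representation described above can be illustrated with a toy construction: exons become nodes carrying read counts, and junction reads become weighted edges. The exon names and counts below are hypothetical, and the real method feeds such graphs to a variational graph autoencoder rather than stopping at the adjacency matrix.

```python
import numpy as np

# Hypothetical per-cell data for one gene (illustrative numbers):
# exon read counts, plus junction reads linking exon pairs.
exon_counts = {"E1": 12, "E2": 0, "E3": 7, "E4": 5}
junction_reads = {("E1", "E3"): 9,   # E2 skipped: an alternative-splicing signal
                  ("E3", "E4"): 4}

exons = sorted(exon_counts)
idx = {e: i for i, e in enumerate(exons)}

# Node features: exon-level counts. A single gene-level count (sum = 24)
# would collapse this structure and hide the skipped exon E2.
features = np.array([[exon_counts[e]] for e in exons], dtype=float)

# Adjacency: junction reads as symmetric edge weights between exon nodes.
n = len(exons)
adj = np.zeros((n, n))
for (a, b), reads in junction_reads.items():
    adj[idx[a], idx[b]] = adj[idx[b], idx[a]] = reads

print(features.ravel())  # exon-level signal preserved
print(adj)
```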
Extracting and Following Paths for Robust Relational Reasoning with Large Language Models
Ge Zhang
Mohammad Alomrani
Jiaming Zhou
Yaochen Hu
B. Wang
Qun Liu
Mark J. Coates
Yingxue Zhang
Jianye HAO
Decoding Humor-Induced Amusement via Facial Expression Analysis: Toward Emotion-Aware Applications
Gabrielle Toupin
Marie Buffo
Clément Feyt
Golnoush Alamian
Anne-Lise Saive
Humor is widely recognized for its positive effects on well-being, including stress reduction, mood enhancement, and cognitive benefits. Yet the lack of reliable tools to objectively quantify amusement, particularly its temporal dynamics, has limited progress in this area. Existing measures often rely on self-report or coarse summary ratings, providing little insight into how amusement unfolds over time. To address this gap, we developed a Random Forest model to predict the intensity of amusement evoked by humorous video clips, based on participants' facial expressions, particularly the co-activation of Facial Action Units 6 and 12 ("% Smile"), and video features such as motion, saliency, and topic. Our results show that exposure to humorous content significantly increases "% Smile", with amusement peaking toward the end of videos. Importantly, we observed emotional carry-over effects, suggesting that consecutive humorous stimuli can sustain or amplify positive emotional responses. Even when trained solely on humorous content, the model reliably predicted amusement intensity, underscoring the robustness of our approach. Overall, this study provides a novel, objective method to track amusement on a fine temporal scale, advancing the measurement of nonverbal emotional expression. These findings may inform the design of emotion-aware applications and humor-based therapeutic interventions to promote well-being and emotional health.
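The "% Smile" feature above (co-activation of AU6 and AU12) can be sketched as a simple frame-level computation. The activation threshold, the 0–5 intensity scale, and the toy traces below are illustrative assumptions, not the study's exact settings.

```python
def percent_smile(au6, au12, threshold=1.0):
    """Percentage of frames where AU6 (cheek raiser) and AU12
    (lip-corner puller) are co-activated above a threshold."""
    assert len(au6) == len(au12)
    active = [a >= threshold and b >= threshold for a, b in zip(au6, au12)]
    return 100.0 * sum(active) / len(active)

# Toy intensity traces over 8 video frames (0-5 scale, OpenFace-style
# output assumed); amusement ramps up toward the end of the clip.
au6  = [0.2, 0.5, 1.4, 2.0, 2.5, 3.0, 3.2, 3.5]
au12 = [0.1, 0.8, 1.2, 1.9, 2.6, 3.1, 3.3, 3.6]
print(percent_smile(au6, au12))  # 75.0
```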
Personalizing brain stimulation: continual learning for sleep spindle detection
Hugo R Jourde
S Ehsan M Bajestani
Emily B J Coffey
Objective. Personalized stimulation, in which algorithms used to detect neural events adapt to a user's unique neural characteristics, may be crucial to enable optimized and consistent stimulation quality for both fundamental research and clinical applications. Precise stimulation of sleep spindles (transient patterns of brain activity that occur during non-rapid eye movement sleep and are involved in memory consolidation) presents an exciting frontier for studying memory functions; however, this endeavour is challenged by the spindles' fleeting nature, inter-individual variability, and the necessity of real-time detection. Approach. We tackle these challenges using a novel continual learning framework. Using a pre-trained model capable of both online classification of sleep stages and spindle detection, we implement an algorithm that refines spindle detection, tailoring it to the individual throughout one or more nights without manual intervention. Main results. Our methodology achieves accurate, subject-specific targeting of sleep spindles and enables advanced closed-loop stimulation studies. While fine-tuning alone offers minimal benefit for single nights, our approach of combining fine-tuning with weight averaging demonstrates significant improvement over multiple nights, effectively mitigating catastrophic forgetting. Significance. This work represents an important step towards signal-level personalization of brain stimulation that can be applied to different brain stimulation paradigms, including closed-loop brain stimulation, and to different neural events. Applications in fundamental neuroscience may enhance the investigative potential of brain stimulation for understanding cognitive processes such as the role of sleep spindles in memory consolidation, and may lead to novel therapeutic applications.
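The weight-averaging ingredient mentioned above can be sketched as a uniform average of per-night checkpoints, a common way to blend adaptation with retention and limit catastrophic forgetting. This is a toy sketch in the spirit of that idea; the paper's exact recipe (weighting, which layers are averaged) may differ.

```python
import numpy as np

def average_weights(checkpoints):
    """Uniformly average model parameters across checkpoints,
    e.g. one fine-tuned checkpoint per recorded night."""
    keys = checkpoints[0].keys()
    return {k: np.mean([c[k] for c in checkpoints], axis=0) for k in keys}

# Toy checkpoints: the same parameters after fine-tuning on two nights.
night1 = {"w": np.array([1.0, 2.0]), "b": np.array([0.0])}
night2 = {"w": np.array([3.0, 4.0]), "b": np.array([0.2])}
avg = average_weights([night1, night2])
print(avg["w"], avg["b"])  # [2. 3.] [0.1]
```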
Toward whole-genome inference of polygenic scores with fast and memory-efficient algorithms.
Chirayu Anant Haryan
Simon Gravel
Sanchit Misra
Yuemei Li
AfroBench: How Good are Large Language Models on African Languages?
Kelechi Ogueji
Pontus Stenetorp
Aligning artificial intelligence with the United Nations Sustainable Development Goals (SDGs)
Marie Zumstein
Karine Gentelet
An Artificial Intelligence-Based Model to Predict Pregnancy After Intrauterine Insemination: A Retrospective Analysis of 9501 Cycles
Camille Grysole
Penelope Borduas
Isaac-Jacques Kadoch
Simon Phillips
Daniel Dufort
Background/Objectives: Intrauterine insemination (IUI) is a common first-line approach in the treatment of numerous infertile couples, especially in cases of unexplained infertility. Its relatively low success rate, however, could benefit from the development of AI-based support tools to predict its outcome, thus helping the clinical management of patients undergoing IUI cycles. Our objective was to develop a robust and accurate machine learning model that predicts pregnancy outcomes following IUI. Methods: A retrospective, observational, and single-center study was conducted. In total, 3535 couples (aged 18–43 years) that underwent IUI between January 2011 and December 2015 were recruited. Twenty-one clinical and laboratory parameters of 9501 IUI cycles were used to train different machine learning algorithms. Accuracy of pregnancy outcome prediction was evaluated by an area under the curve (AUC) analysis. Results: The linear SVM outperformed AdaBoost, kernel SVM, Random Forest, Extreme Forest, Bagging, and Voting classifiers. Pre-wash sperm concentration, the ovarian stimulation protocol, cycle length, and maternal age were strong predictors of a positive pregnancy test following IUI (AUC = 0.78). Paternal age was found to be the worst predictor. Conclusions: Our linear SVM model predicts a positive pregnancy outcome following IUI. Although this model shows value for the clinical management of infertile patients and informed decision-making by the patients, further validation using independent datasets is required prior to clinical implementation.
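The AUC metric used above can be computed directly from classifier scores via the rank (Mann-Whitney) formulation: the probability that a randomly chosen positive case outranks a randomly chosen negative one. The scores and labels below are purely illustrative, not study data.

```python
def auc(scores, labels):
    """Area under the ROC curve: probability that a random positive
    scores higher than a random negative (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: hypothetical model scores vs. pregnancy outcome (1/0).
scores = [0.9, 0.8, 0.35, 0.7, 0.2, 0.6]
labels = [1,   1,   0,    1,   0,   0]
print(auc(scores, labels))  # 1.0 - every positive outranks every negative
```

An AUC of 0.78, as reported, means a randomly chosen positive cycle receives a higher score than a randomly chosen negative one 78% of the time.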
AURA: A Multi-Modal Medical Agent for Understanding, Reasoning & Annotation
Recent advancements in Large Language Models (LLMs) have catalyzed a paradigm shift from static prediction systems to agentic AI agents capable of reasoning, interacting with tools, and adapting to complex tasks. While LLM-based agentic systems have shown promise across many domains, their application to medical imaging remains in its infancy. In this work, we introduce AURA, the first visual linguistic explainability agent designed specifically for comprehensive analysis, explanation, and evaluation of medical images. By enabling dynamic interactions, contextual explanations, and hypothesis testing, AURA represents a significant advancement toward more transparent, adaptable, and clinically aligned AI systems. We highlight the promise of agentic AI in transforming medical image analysis from static predictions to interactive decision support. Leveraging Qwen-32B, an LLM-based architecture, AURA integrates a modular toolbox comprising: (i) a segmentation suite with phrase grounding, pathology segmentation, and anatomy segmentation to localize clinically meaningful regions; (ii) a counterfactual image-generation module that supports reasoning through image-level explanations; and (iii) a set of evaluation tools including pixel-wise difference-map analysis, classification, and advanced state-of-the-art components to assess diagnostic relevance and visual interpretability.
Autoregressive Speech Enhancement via Acoustic Tokens
Yusuf Cem Sübakan
Mirco Ravanelli
CADmium: Fine-Tuning Code Language Models for Text-Driven Sequential CAD Design
Computer-aided design (CAD) is the digital construction of 2D and 3D objects, and is central to a wide range of engineering and manufacturing applications, such as the automotive and aviation industries. Despite its importance, CAD modeling remains largely a time-intensive, manual task. Recent works have attempted to automate this process with small transformer-based models and handcrafted CAD sequence representations. However, there has been little effort to leverage the potential of large language models (LLMs) for sequential CAD design. In this work, we introduce a new large-scale dataset of more than 170k CAD models annotated with high-quality, human-like descriptions generated with our pipeline based on GPT-4.1. Using this dataset, we fine-tune powerful code-LLMs to generate CAD sequences represented in a JSON-based format from natural language descriptions, demonstrating the viability and effectiveness of this approach for text-conditioned CAD generation. Because simple metrics often fail to reflect the quality of generated objects, we introduce geometric and topological metrics based on sphericity, mean curvature, and Euler characteristic to provide richer structural insights. Our experiments and ablation studies on both synthetic and human-annotated data demonstrate that CADmium is able to automate CAD design, drastically speeding up the design of new objects. The dataset, code, and fine-tuned models are available online.
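Two of the metrics named above have standard closed forms that are easy to sketch: sphericity compares an object's surface area to that of a sphere of equal volume, and the Euler characteristic V − E + F summarizes mesh topology. The unit-cube example is illustrative; the paper computes these quantities on generated CAD meshes.

```python
import math

def sphericity(volume, area):
    """Sphericity: surface area of the equal-volume sphere divided by
    the object's surface area; equals 1.0 for a perfect sphere."""
    return math.pi ** (1 / 3) * (6 * volume) ** (2 / 3) / area

def euler_characteristic(n_vertices, n_edges, n_faces):
    """Euler characteristic V - E + F; 2 for any sphere-like closed mesh."""
    return n_vertices - n_edges + n_faces

# Unit cube: volume 1, surface area 6; its mesh has 8 vertices,
# 12 edges, and 6 faces.
print(round(sphericity(1.0, 6.0), 3))  # 0.806
print(euler_characteristic(8, 12, 6))  # 2
```

Values like these flag structural defects (e.g. a non-watertight mesh shifts the Euler characteristic away from 2) that token-level similarity metrics cannot see.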