An Artificial Intelligence-Based Model to Predict Pregnancy After Intrauterine Insemination: A Retrospective Analysis of 9501 Cycles.
J. Minano Masip
Camille Grysole
Penelope Borduas
I. Kadoch
Simon Phillips
Daniel Dufort
Background/Objectives: Intrauterine insemination (IUI) is a common first-line approach in the treatment of numerous infertile couples, especially in cases of unexplained infertility. Its relatively low success rate, however, could benefit from the development of AI-based support tools to predict its outcome, thus helping the clinical management of patients undergoing IUI cycles. Our objective was to develop a robust and accurate machine learning model that predicts pregnancy outcomes following IUI. Methods: A retrospective, observational, single-center study was conducted. In total, 3535 couples (aged 18-43 years) who underwent IUI between January 2011 and December 2015 were recruited. Twenty-one clinical and laboratory parameters of 9501 IUI cycles were used to train different machine learning algorithms. Predictive accuracy was evaluated by area under the curve (AUC) analysis. Results: The linear SVM outperformed AdaBoost, kernel SVM, random forest, extreme forest, bagging, and voting classifiers. Pre-wash sperm concentration, the ovarian stimulation protocol, cycle length, and maternal age were strong predictors of a positive pregnancy test following IUI (AUC = 0.78). Paternal age was the weakest predictor. Conclusions: Our linear SVM model predicts a positive pregnancy outcome following IUI. Although this model shows value for the clinical management of infertile patients and for informed decision-making by the patients, further validation on independent datasets is required prior to clinical implementation.
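Below is a minimal sketch of the modelling pipeline the abstract describes, assuming scikit-learn and synthetic placeholder data; the feature matrix stands in for the 21 clinical and laboratory parameters, which are not reproduced here.

```python
# Minimal sketch, assuming scikit-learn and synthetic placeholder data;
# feature columns are hypothetical stand-ins for the study's 21 parameters.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_cycles, n_features = 9501, 21          # cycle and parameter counts from the study
X = rng.normal(size=(n_cycles, n_features))
y = rng.integers(0, 2, size=n_cycles)    # placeholder pregnancy outcome labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Linear SVM; decision_function scores are used for the AUC evaluation.
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
model.fit(X_tr, y_tr)
scores = model.decision_function(X_te)
print(f"AUC: {roc_auc_score(y_te, scores):.2f}")
```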
Boosting LLM Reasoning via Spontaneous Self-Correction
Tengyu Xu
Xuewei Wang
Zhengxing Chen
Di Jin
Liang Tan
Yen-Ting Lin
Zishun Yu
Zhuokai Zhao
Si-Yuan Wang
Yun He
Sinong Wang
Han Fang
Meta AI
Chen Zhu
Mila - Québec AI Institute
Polytechnique Montréal
While large language models (LLMs) have demonstrated remarkable success on a broad range of tasks, math reasoning remains a challenging one. One approach to improving math reasoning is self-correction, which designs self-improving loops to let the model correct its own mistakes. However, existing self-correction approaches treat corrections as standalone post-generation refinements, relying on extra prompt and system designs to elicit self-corrections, instead of performing real-time, spontaneous self-correction in a single pass. To address this, we propose SPOC, a spontaneous self-correction approach that enables LLMs to generate interleaved solutions and verifications in a single inference pass, with generation dynamically terminated based on verification outcomes, thereby effectively scaling inference-time compute. SPOC takes a multi-agent perspective by assigning dual roles, solution proposer and verifier, to the same model. We adopt a simple yet effective approach to generating synthetic data for fine-tuning, enabling the model to develop capabilities for self-verification and multi-agent collaboration. We further improve its solution proposal and verification accuracy through online reinforcement learning. Experiments on mathematical reasoning benchmarks show that SPOC significantly improves performance. Notably, SPOC boosts the accuracy of the Llama-3.1-8B and 70B Instruct models, achieving gains of 8.8% and 11.6% on MATH500, 10.0% and 20.0% on AMC23, and 3.3% and 6.7% on AIME24, respectively.
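The interleaved propose-then-verify behavior can be approximated externally with a simple loop, sketched below under stated assumptions: `generate` is a hypothetical stand-in for any LLM completion call, and the actual SPOC model emits solution and verification in a single pass rather than via this orchestration.

```python
# Hedged sketch of SPOC-style interleaved proposing and verifying,
# approximated as a loop around a generic LLM call. `generate` is a
# hypothetical placeholder, not the authors' implementation.
def generate(prompt: str) -> str:
    """Placeholder LLM call; replace with a real completion API."""
    return "SOLUTION: ... VERIFICATION: correct"

def spoc_inference(problem: str, max_rounds: int = 4) -> str:
    transcript = f"Problem: {problem}\n"
    for _ in range(max_rounds):
        turn = generate(transcript + "Propose a solution, then verify it.")
        transcript += turn + "\n"
        # Terminate as soon as the model's own verifier accepts the solution.
        if "VERIFICATION: correct" in turn:
            break
    return transcript

print(spoc_inference("What is 17 * 24?"))
```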
Discrete Feynman-Kac Correctors
Marta Skreta
Alan Aspuru-Guzik
The performance of Large Language Models (LLMs) directly depends on the size of the context that the model was trained on. Despite significant progress in increasing the context size of current models, some applications remain bottlenecked by the number of tokens processed at inference time. A particular mathematical problem LLMs can be used for is inferring the parameters of a statistical model, given data points as input. Here we demonstrate that discrete diffusion models offer a promising avenue for scaling such parameter-prediction tasks by combining the outputs of the same model evaluated on different parts of the training data. We propose Discrete Feynman-Kac Correctors, a framework for controlling the generated distribution of discrete masked diffusion models at inference time. We derive Sequential Monte Carlo (SMC) algorithms that, given a trained discrete diffusion model, sample from its annealed distribution or from a product of distributions with different conditions. Notably, our framework requires no training, fine-tuning, or external reward functions. Finally, we apply our framework to amortized linear regression using LLaDA and demonstrate that it drastically outperforms the standard inference procedure in terms of accuracy and adherence to the prompt format.
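A toy illustration of the SMC reweighting idea follows; `loglik_under_condition` is a hypothetical placeholder for the masked diffusion model's per-step scores, and this sketch shows only the product-of-distributions weighting and resampling step, not the authors' algorithm.

```python
# Toy SMC sketch: maintain a population of discrete sequences ("particles"),
# weight each by the product of per-condition likelihoods, and resample.
import numpy as np

rng = np.random.default_rng(0)

def loglik_under_condition(particle: np.ndarray, cond: int) -> float:
    """Hypothetical log-likelihood of a discrete sequence under one condition."""
    return -0.1 * np.abs(particle - cond).sum()

def smc_step(particles: np.ndarray, conditions: list[int]) -> np.ndarray:
    # Product of conditional likelihoods = sum of log-likelihoods.
    logw = np.array([sum(loglik_under_condition(p, c) for c in conditions)
                     for p in particles])
    w = np.exp(logw - logw.max())
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]  # resampled population

particles = rng.integers(0, 5, size=(8, 10))  # 8 particles, 10 tokens each
particles = smc_step(particles, conditions=[1, 3])
print(particles)
```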
Instilling Parallel Reasoning into Language Models
Matthew Macfarlane
Minseon Kim
Nebojsa Jojic
Weijia Xu
Lucas Caccia
Xingdi Yuan
Wanru Zhao
Zhengyan Shi
Sequential chain-of-thought reasoning significantly improves the performance of large language models (LLMs) on complex tasks. However, sequential reasoning has structural limitations: long chains are expensive due to attention's quadratic complexity, and multiple diverse strategies cannot be considered simultaneously. To address this, we propose a method that instills parallel reasoning capabilities in LLMs by distilling parallel reasoning traces from a teacher model. This approach enables models to decompose problems, explore diverse strategies via concurrent reasoning traces, and aggregate trace outputs for the final answer. Evaluating on a variety of math and puzzle benchmarks, such as MATH 500, AIME, and Countdown, we show that our approach can decompose parallelizable problems and that performance scales with the number of parallel traces. The resulting model can dynamically allocate reasoning strategies based on problem complexity, outperforming standard sampling methods.
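As a rough external analogue of the distilled behavior, the sketch below samples several reasoning traces concurrently and aggregates their final answers by majority vote; `sample_trace` is a hypothetical stand-in for a stochastic LLM call, whereas the paper instills this behavior inside the model itself.

```python
# Sketch of concurrent reasoning traces with majority-vote aggregation,
# using a hypothetical `sample_trace` placeholder for one LLM rollout.
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def sample_trace(problem: str, seed: int) -> str:
    """Placeholder for one stochastic reasoning trace ending in an answer."""
    return "42" if seed % 3 else "41"

def parallel_reason(problem: str, n_traces: int = 8) -> str:
    with ThreadPoolExecutor(max_workers=n_traces) as pool:
        answers = list(pool.map(lambda s: sample_trace(problem, s),
                                range(n_traces)))
    # Aggregate trace outputs: the most common final answer wins.
    return Counter(answers).most_common(1)[0][0]

print(parallel_reason("toy problem"))
```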
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models
Zhanke Zhou
Xuan Li
Mikhail Galkin
Xiao Feng
Sanmi Koyejo
Bo Han
Numerous applications of large language models (LLMs) rely on their ability to perform step-by-step reasoning. However, the reasoning behavior of LLMs remains poorly understood, posing challenges to research, development, and safety. To address this gap, we introduce landscape of thoughts, the first visualization tool for users to inspect the reasoning paths of chain-of-thought and its derivatives on any multiple-choice dataset. Specifically, we represent the states in a reasoning path as feature vectors that quantify their distances to all answer choices. These features are then visualized in two-dimensional plots using t-SNE. Qualitative analysis shows that the landscape of thoughts effectively distinguishes between strong and weak models, correct and incorrect answers, and different reasoning tasks. It also uncovers undesirable reasoning patterns, such as low consistency and high uncertainty. Additionally, users can adapt our tool to a model that predicts any property they observe. We showcase this advantage by adapting our tool to a lightweight verifier, which significantly improves reasoning by evaluating the correctness of reasoning paths. The code is publicly available at https://github.com/tmlr-group/landscape-of-thoughts.
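The feature construction is concrete enough to sketch: embed each intermediate reasoning state, featurize it by its distances to the answer choices, and project to 2D with t-SNE. The embeddings below are random placeholders standing in for a real sentence encoder.

```python
# Sketch of the visualization pipeline, assuming random placeholder
# embeddings in place of a real sentence encoder.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
n_states, n_choices, dim = 60, 4, 32
states = rng.normal(size=(n_states, dim))    # placeholder reasoning-state embeddings
choices = rng.normal(size=(n_choices, dim))  # placeholder answer-choice embeddings

# Feature vector per state: its distance to every answer choice.
features = np.linalg.norm(states[:, None, :] - choices[None, :, :], axis=-1)

xy = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(features)
print(xy.shape)  # (60, 2) points ready for plotting
```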
Learning to Solve Complex Problems via Dataset Decomposition
Wanru Zhao
Lucas Caccia
Zhengyan Shi
Minseon Kim
Xingdi Yuan
Weijia Xu
Marc-Alexandre Côté
Curriculum learning is a class of training strategies that organizes the data a model is exposed to by difficulty, progressing gradually from simpler to more complex examples. This work explores a reverse curriculum generation approach that recursively decomposes complex datasets into simpler, more learnable components. We propose a teacher-student framework in which the teacher, equipped with step-by-step reasoning ability, recursively generates easier versions of examples, enabling the student model to progressively master difficult tasks. We also propose a novel scoring system that measures data difficulty based on structural complexity and conceptual depth, allowing curriculum construction over the decomposed data. Experiments on math datasets (MATH and AIME) demonstrate that models trained with curricula generated by our approach outperform standard training on the original datasets.
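A minimal sketch of the curriculum-ordering step appears below; the paper's scoring combines structural complexity and conceptual depth, so the token-and-nesting proxy used here is purely an illustrative assumption.

```python
# Hedged sketch of curriculum construction: score each example with a
# crude structural-complexity proxy (an assumption, not the paper's
# scoring system), then order training data from easy to hard.
def difficulty(example: str) -> float:
    """Toy proxy: longer, more nested problems count as harder."""
    return len(example.split()) + 5 * example.count("(")

dataset = [
    "2 + 2",
    "solve (x + 1)(x - 3) = 0",
    "integrate (x**2 + 1) / (x + 2) over [0, 1]",
]
curriculum = sorted(dataset, key=difficulty)  # simplest examples first
for ex in curriculum:
    print(f"{difficulty(ex):5.1f}  {ex}")
```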
Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers
Celo: Training Versatile Learned Optimizers on a Compute Diet
Learned optimization has emerged as a promising alternative to hand-crafted optimizers, with the potential to discover stronger learned update rules that enable faster, hyperparameter-free training of neural networks. A critical element of practically useful learned optimizers, ones that can be used off-the-shelf after meta-training, is strong meta-generalization: the ability to apply the optimizer to new tasks. Recent state-of-the-art work in learned optimizers, VeLO (Metz et al., 2022), requires a large number of highly diverse meta-training tasks along with massive computational resources (4000 TPU-months) to achieve meta-generalization. This makes further improvements to such learned optimizers impractical. In this work, we identify several key elements in learned optimizer architectures and meta-training procedures that lead to strong meta-generalization. We also propose evaluation metrics to reliably assess the quantitative performance of an optimizer at scale on a set of evaluation tasks. Our proposed approach, Celo, makes a significant leap in the meta-generalization performance of learned optimizers and also outperforms tuned state-of-the-art optimizers on a diverse set of out-of-distribution tasks, despite being meta-trained for just 24 GPU-hours.
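To make the learned-optimizer interface concrete, here is a toy sketch in which a small network maps per-parameter features (gradient and momentum) to updates; Celo's actual architecture and meta-training procedure are not described by this snippet, and the untrained weights here are illustrative only.

```python
# Toy numpy sketch of the learned-optimizer interface: a tiny MLP maps
# per-parameter features to an update. Weights are random placeholders;
# a real learned optimizer meta-trains them across many tasks.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(2, 8))
W2 = rng.normal(scale=0.1, size=(8, 1))

def learned_update(grad: np.ndarray, mom: np.ndarray) -> np.ndarray:
    feats = np.stack([grad, mom], axis=-1)   # per-parameter features
    h = np.tanh(feats @ W1)                  # tiny MLP update rule
    return (h @ W2)[..., 0]

params = rng.normal(size=5)
mom = np.zeros(5)
for _ in range(3):
    grad = 2 * params                        # gradient of a toy quadratic loss
    mom = 0.9 * mom + grad
    params -= 0.01 * learned_update(grad, mom)  # apply the learned rule
print(params)
```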
Towards fair decentralized benchmarking of healthcare AI algorithms with the Federated Tumor Segmentation (FeTS) challenge
Maximilian Zenk
Ujjwal Baid
Sarthak Pati
Akis Linardos
Brandon Edwards
Micah Sheller
Patrick Foley
Alejandro Aristizabal
David Zimmerer
Alexey Gruzdev
Jason Martin
Russell T. Shinohara
Annika Reinke
Fabian Isensee
Santhosh Parampottupadam
Kaushal Parekh
Ralf Floca
Hasan Kassem
Bhakti Baheti
Siddhesh Thakur
Verena Chung
Kaisar Kushibar
Karim Lekadir
Meirui Jiang
Youtan Yin
Hongzheng Yang
Quande Liu
Cheng Chen
Qi Dou
Pheng-Ann Heng
Xiaofan Zhang
Shaoting Zhang
Muhammad Irfan Khan
Mohammad Ayyaz Azeem
Mojtaba Jafaritadi
Esa Alhoniemi
Elina Kontio
Suleiman A. Khan
Leon Mächler
Ivan Ezhov
Florian Kofler
Suprosanna Shit
Johannes C. Paetzold
Timo Loehr
Benedikt Wiestler
Himashi Peiris
Kamlesh Pawar
Shenjun Zhong
Zhaolin Chen
Munawar Hayat
Gary Egan
Mehrtash Harandi
Ece Isik Polat
Gorkem Polat
Altan Kocyigit
Alptekin Temizel
Anup Tuladhar
Lakshay Tyagi
Raissa Souza
Nils D. Forkert
Pauline Mouches
Matthias Wilms
Vishruth Shambhat
Akansh Maurya
Shubham Subhas Danannavar
Rohit Kalla
Vikas Kumar Anand
Ganapathy Krishnamurthi
Sahil Nalawade
Chandan Ganesh
Ben Wagner
Divya Reddy
Yudhajit Das
Fang F. Yu
Baowei Fei
Ananth J. Madhuranthakam
Joseph Maldjian
Gaurav Singh
Jianxun Ren
Wei Zhang
Ning An
Qingyu Hu
Youjia Zhang
Ying Zhou
Vasilis Siomos
Giacomo Tarroni
Jonathan Passerrat-Palmbach
Ambrish Rawat
Giulio Zizzo
Swanand Ravindra Kadhe
Jonathan P. Epperlein
Stefano Braghin
Yuan Wang
Renuga Kanagavelu
Qingsong Wei
Yechao Yang
Yong Liu
Krzysztof Kotowski
Szymon Adamski
Bartosz Machura
Wojciech Malara
Lukasz Zarudzki
Jakub Nalepa
Yaying Shi
Hongjian Gao
Salman Avestimehr
Yonghong Yan
Agus S. Akbar
Ekaterina Kondrateva
Hua Yang
Zhaopei Li
Hung-Yu Wu
Johannes Roth
Camillo Saueressig
Alexandre Milesi
Quoc D. Nguyen
Nathan J. Gruenhagen
Tsung-Ming Huang
Jun Ma
Har Shwinder H. Singh
Nai-Yu Pan
Dingwen Zhang
Ramy A. Zeineldin
Michal Futrega
Yading Yuan
Gian Marco Conte
Xue Feng
Quan D. Pham
Yong Xia
Zhifan Jiang
Huan Minh Luu
Mariia Dobko
Alexandre Carré
Bair Tuchinov
Hassan Mohy-ud-Din
Saruar Alam
Anup Singh
Nameeta Shah
Weichung Wang
Chiharu Sako
Michel Bilello
Satyam Ghodasara
Suyash Mohan
Christos Davatzikos
Evan Calabrese
Jeffrey Rudie
Javier Villanueva-Meyer
Soonmee Cha
Christopher Hess
John Mongan
Madhura Ingalhalikar
Manali Jadhav
Umang Pandey
Jitender Saini
Raymond Y. Huang
Ken Chang
Minh-Son To
Sargam Bhardwaj
Chee Chong
Marc Agzarian
Michal Kozubek
Filip Lux
Jan Michálek
Petr Matula
Miloš Keřkovský
Tereza Kopřivová
Marek Dostál
Václav Vybíhal
Marco C. Pinho
James Holcomb
Marie Metz
Rajan Jain
Matthew D. Lee
Yvonne W. Lui
Pallavi Tiwari
Ruchika Verma
Rohan Bareja
Ipsa Yadav
Jonathan Chen
Yuriy Gusev
Krithika Bhuvaneshwar
Anousheh Sayah
Camelia Bencheqroun
Anas Belouali
Subha Madhavan
Rivka R. Colen
Aikaterini Kotrotsou
Philipp Vollmuth
Gianluca Brugnara
Chandrakanth J. Preetha
Felix Sahm
Martin Bendszus
Wolfgang Wick
Abhishek Mahajan
Carmen Balaña
Jaume Capellades
Josep Puig
Yoon Seong Choi
Seung-Koo Lee
Jong Hee Chang
Sung Soo Ahn
Hassan F. Shaykh
Alejandro Herrera-Trujillo
Maria Trujillo
William Escobar
Ana Abello
Jose Bernal
Jhon Gómez
Pamela LaMontagne
Daniel S. Marcus
Mikhail Milchenko
Arash Nazeri
BENNETT A. LANDMAN
Karthik Ramadass
Kaiwen Xu
Silky Chotai
Lola B. Chambless
Akshitkumar Mistry
Reid C. Thompson
Ashok Srinivasan
Jayapalli R. Bapuraj
Arvind Rao
Nicholas Wang
Ota Yoshiaki
Toshio Moritani
Sevcan Turk
Joonsang Lee
Snehal Prabhudesai
John Garrett
Matthew Larson
Robert Jeraj
Hongwei Li
Tobias Weiss
Michael Weller
Andrea Bink
Bertrand Pouymayou
Sonam Sharma
Tzu-Chi Tseng
Saba Adabi
Alexandre Xavier Falcão
Samuel B. Martins
Bernardo C. A. Teixeira
Flávia Sprenger
David Menotti
Diego R. Lucio
Simone P. Niclou
Olivier Keunen
Ann-Christin Hau
Enrique Pelaez
Heydy Franco-Maldonado
Francis Loayza
Sebastian Quevedo
Richard McKinley
Johannes Slotboom
Piotr Radojewski
Raphael Meier
Roland Wiest
Johannes Trenkler
Josef Pichler
Georg Necker
Andreas Haunschmidt
Stephan Meckel
Pamela Guevara
Esteban Torche
Cristobal Mendoza
Franco Vera
Elvis Ríos
Eduardo López
Sergio A. Velastin
Stephen Baek
Yusung Kim
Heba Ismael
Bryan Allen
John M. Buatti
Peter Zampakis
Vasileios Panagiotopoulos
Panagiotis Tsiganos
Sotiris Alexiou
Ilias Haliassos
Evangelia I. Zacharaki
Konstantinos Moustakas
Christina Kalogeropoulou
Dimitrios M. Kardamakis
Bing Luo
Laila M. Poisson
Ning Wen
Mahdi A. L. Loutfi
David Fortin
Martin Lepage
Fanny Morón
Jacob Mandel
Gaurav Shukla
Spencer Liem
Gregory S. Alexandre
Joseph Lombardo
Joshua D. Palmer
Adam E. Flanders
Adam P. Dicker
Godwin Ogbole
Dotun Oyekunle
Olubunmi Odafe-Oyibotha
Babatunde Osobu
Mustapha Shu’aibu Hikima
Mayowa Soneye
Farouk Dako
Adeleye Dorcas
Derrick Murcia
Eric Fu
Rourke Haas
John A. Thompson
David Ryan Ormond
Stuart Currie
Kavi Fatania
Russell Frood
Amber L. Simpson
Jacob J. Peoples
Ricky Hu
Danielle Cutler
Fabio Y. Moraes
Anh Tran
Mohammad Hamghalam
Michael A. Boss
James Gimpel
Deepak Kattil Veettil
Kendall Schmidt
Lisa Cimino
Cynthia Price
Brian Bialecki
Sailaja Marella
Charles Apgar
Andras Jakab
Marc-André Weber
Errol Colak
Jens Kleesiek
John Freymann
Justin Kirby
Lena Maier-Hein
Jake Albrecht
Peter Mattson
Alexandros Karargyris
Prashant Shah
Bjoern Menze
Klaus Maier-Hein
Spyridon Bakas
Computational competitions are the standard for benchmarking medical image analysis algorithms, but they typically use small curated test datasets acquired at a few centers, leaving a gap to the reality of diverse multicentric patient data. To this end, the Federated Tumor Segmentation (FeTS) Challenge represents a paradigm for real-world algorithmic performance evaluation. The FeTS challenge is a competition to benchmark (i) federated learning aggregation algorithms and (ii) state-of-the-art segmentation algorithms, across multiple international sites. Weight aggregation and client selection techniques were compared using a multicentric brain tumor dataset in realistic federated learning simulations, showing benefits for adaptive weight aggregation and efficiency gains through client sampling. Quantitative performance evaluation of state-of-the-art segmentation algorithms on data distributed internationally across 32 institutions yielded good generalization on average, although the worst-case performance revealed data-specific failure modes. Similar multi-site setups can help validate the real-world utility of healthcare AI algorithms in the future.
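The first track's baseline aggregation step can be sketched as a FedAvg-style weighted average of client model weights, shown below; the adaptive schemes benchmarked in the challenge replace these fixed, size-proportional coefficients with learned or performance-based ones.

```python
# FedAvg-style weight aggregation sketch: average client weights with
# coefficients proportional to local dataset size.
import numpy as np

def aggregate(client_weights: list[np.ndarray], n_samples: list[int]) -> np.ndarray:
    coef = np.array(n_samples, dtype=float)
    coef /= coef.sum()                     # normalize each client's contribution
    return sum(c * w for c, w in zip(coef, client_weights))

clients = [np.ones(4), 2 * np.ones(4), 4 * np.ones(4)]
print(aggregate(clients, n_samples=[100, 300, 600]))  # skewed toward larger sites
```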
DOLPHIN advances single-cell transcriptomics beyond gene level by leveraging exon and junction reads
Kailu Song
Yumin Zheng
Bowen Zhao
David H. Eidelman
Assemblies, synapse clustering, and network topology interact with plasticity to explain structure-function relationships of the cortical connectome
András Ecker
Daniela Egas Santander
Marwan Abdellah
Jorge Blanco Alonso
Sirio Bolaños-Puchet
Giuseppe Chindemi
James B. Isbister
James King
Pramod Kumbhar
Ioannis Magkanaris
Michael W. Reimann
Synaptic plasticity underlies the brain's ability to learn and adapt. While experiments in brain slices have revealed mechanisms and protocols for the induction of plasticity between pairs of neurons, how these synaptic changes are coordinated in biological neuronal networks to ensure the emergence of learning remains poorly understood. Simulation and modeling have emerged as important tools to study learning in plastic networks, but have yet to reach a scale that incorporates realistic network structure, active dendrites, and multi-synapse interactions, key determinants of synaptic plasticity. To rise to this challenge, we endowed an existing large-scale cortical network model, incorporating data-constrained dendritic processing and multi-synaptic connections, with a calcium-based model of functional plasticity that captures the diversity of excitatory connections extrapolated to in vivo-like conditions. This allowed us to study how dendrites and network structure interact with plasticity to shape stimulus representations at the microcircuit level. In our exploratory simulations, plasticity acted sparsely and specifically, and firing rates and weight distributions remained stable without additional homeostatic mechanisms. At the circuit level, plasticity was driven by the co-firing of stimulus-evoked functional assemblies, the spatial clustering of synapses on dendrites, and the topology of the network connectivity. As a result of the plastic changes, the network became more reliable, with more stimulus-specific responses. We confirmed our testable predictions in the MICrONS dataset, an openly available electron microscopic reconstruction of a large volume of cortical tissue. Our results quantify at a large scale how the dendritic architecture and higher-order structure of cortical microcircuits play a central role in functional plasticity, and provide a foundation for elucidating their role in learning.
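For intuition, a toy calcium-based plasticity rule in the spirit of this model family (e.g., Graupner and Brunel) is sketched below; all constants are illustrative assumptions, not values from the study.

```python
# Toy calcium-based plasticity sketch: pre- and postsynaptic spikes add
# to a decaying calcium trace, and the synaptic efficacy potentiates or
# depresses when calcium crosses thresholds. Constants are illustrative.
import numpy as np

dt, tau_ca = 1e-3, 0.02            # 1 ms steps, 20 ms calcium decay
c_pre, c_post = 0.6, 1.0           # calcium influx per pre/post spike
theta_d, theta_p = 1.0, 1.4        # depression / potentiation thresholds
gamma_d, gamma_p = 0.01, 0.02      # learning rates

ca, w = 0.0, 0.5
rng = np.random.default_rng(0)
for t in range(1000):
    pre = rng.random() < 0.02      # Poisson-like presynaptic spikes (~20 Hz)
    post = rng.random() < 0.02     # Poisson-like postsynaptic spikes
    ca += c_pre * pre + c_post * post
    ca -= dt / tau_ca * ca         # exponential decay of the calcium trace
    if ca > theta_p:
        w += gamma_p * (1 - w)     # potentiate toward 1
    elif ca > theta_d:
        w -= gamma_d * w           # depress toward 0
print(f"final weight: {w:.3f}")
```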
Curiosity-Driven Exploration via Temporal Contrastive Learning
Catherine Ji
Benjamin Eysenbach
Effective exploration in reinforcement learning requires keeping track not just of where the agent has been, but also of how the agent thinks about and represents the world: an agent should explore states that enable it to learn powerful representations. Temporal representations can include the information required to solve any potential task while avoiding the computational cost of reconstruction. In this paper, we propose an exploration method that uses temporal contrastive representations to drive exploration, maximizing coverage as seen through the lens of these representations. We demonstrate complex exploration behaviors in locomotion, manipulation, and embodied-AI tasks, revealing previously unknown capabilities, as well as behaviors once achievable only via extrinsic rewards.
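A sketch of the two ingredients, with toy numpy stand-ins, appears below: an InfoNCE-style temporal contrastive loss that pulls state representations toward those of their near-future successors, and an intrinsic novelty bonus computed in that representation space. Shapes and constants are assumptions.

```python
# Toy sketch: (1) InfoNCE-style temporal contrastive loss over state /
# successor representation pairs; (2) intrinsic reward as distance to
# the nearest already-visited representation.
import numpy as np

rng = np.random.default_rng(0)

def info_nce(anchors: np.ndarray, positives: np.ndarray, temp: float = 0.1) -> float:
    # Similarity of each anchor to every candidate; positives lie on the diagonal.
    logits = anchors @ positives.T / temp
    logits -= logits.max(axis=1, keepdims=True)
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(logp))

def novelty_bonus(state_repr: np.ndarray, visited: np.ndarray) -> float:
    # Exploration bonus: distance to the closest visited representation.
    return float(np.min(np.linalg.norm(visited - state_repr, axis=1)))

z_t = rng.normal(size=(16, 8))                # representations of states s_t
z_tk = z_t + 0.1 * rng.normal(size=(16, 8))   # representations of successors s_{t+k}
print(f"contrastive loss: {info_nce(z_t, z_tk):.3f}")
print(f"novelty bonus:    {novelty_bonus(rng.normal(size=8), z_t):.3f}")
```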