Emily Kaczmarek

Towards Brain MRI Foundation Models for the Clinic: Findings from the FOMO25 Challenge

Asbjørn Munk

Stefano Cerri

Vardan Nersesjan

Christian Hedeager Krag

Jakob Ambsdorf

Pedro García

Julia Machnio

Peirong Liu

Suhyun Ahn

Nasrin Akbari

Yasmina Al Khalil

Kimberly Amador

Sina Amirrajab

Tal Arbel

Meritxell Bach Cuadra

Ujjwal Baid

Bhakti Baheti

Jaume Banús

Kamil Barbierik

Christoph Brune

步岩松

Baptiste Callard

Yuhan Chen

Cornelius Crijnen

Corentin Dancette

Peter Drotár

Prasad Dutande

Nils D. Forkert

Saurabh K. Garg†

Jakub Gazda

Matej Gazda

Benoît Gérin

Partha Ghosh

Weikang Gong

Pedro M. Gordaliza

Sam Hashemi

Tobias Heimann

Fucang Jia

Jiexin Jiang

Emily Kaczmarek

Chris Kang

Seung Kwan Kang

Mohammad Khazaei

Julien Khlaut

Petros Koutsouvelis

Jae Sung Lee

Yuchong Li

Mengye Lyu

Mingchen Ma

Anant Madabhushi

Klaus H. Maier-Hein

Pierre Manceron

Andrés Martínez Mora

Moona Mazher

Felix Meister

Nataliia Molchanova

Steven A. Niederer

Leonard Nürnberg

Jinah Park

Abdul Qayyum

Jonas Richiardi

Antoine Saporta

Branislav Setlak

Ning Shen

Justin Szeto

Constantin Ulrich

Puru Vaish

Vibujithan Vigneshwaran

Leroy Volmer

Zihao Wang

Siqi Wei

Anthony Winder

Jelmer M. Wolterink

Maxence Wynen

Chang YANG

Si Young Yie

Mostafa Mehdipour Ghazi

Akshay Pai

Espen Jimenez‐Solem

Sebastian Nørgaard Llambias

Mikael Boesen

Michael Eriksen Benros

Juan Eugenio Iglesias

Mads Nielsen

Clinical deployment of automated brain MRI analysis faces a fundamental challenge: clinical data is heterogeneous and noisy, and high-quality labels are prohibitively costly to obtain. Self-supervised learning (SSL) can address this by leveraging the vast amounts of unlabeled data produced in clinical workflows to train robust foundation models that adapt out-of-domain with minimal supervision. However, the development of foundation models for brain MRI has been limited by small pretraining datasets and in-domain benchmarking focused on high-quality, research-grade data. To address this gap, we organized the FOMO25 challenge as a satellite event at MICCAI 2025. FOMO25 provided participants with a large pretraining dataset, FOMO60K, and evaluated models on data sourced directly from clinical workflows in few-shot and out-of-domain settings. Tasks covered infarct classification, meningioma segmentation, and brain age regression, and considered both models trained on FOMO60K (method track) and on any data (open track). Nineteen foundation models from sixteen teams were evaluated using a standardized containerized pipeline. Results show that (a) self-supervised pretraining improves generalization on clinical data under domain shift, with the strongest models trained out-of-domain surpassing supervised baselines trained in-domain; (b) no single pretraining objective benefits all tasks: MAE favors segmentation, while hybrid reconstruction-contrastive objectives favor classification; and (c) small pretrained models achieved strong performance, while scaling model size and training duration did not yield reliable benefits.

2026-04-12

arXiv (preprint)

doi.org

arxiv.org

PIKACHU: Prototypical In-context Knowledge Adaptation for Clinical Heterogeneous Usage

Medical imaging systems increasingly rely on large vision language foundation models (VLFMs) trained on diverse biomedical corpora, yet these models remain difficult to adapt to new clinical tasks without costly fine-tuning and large annotated datasets. We present PIKACHU (Prototypical In-Context Knowledge Adaptation for Clinical Heterogeneous Usage), a lightweight and generalizable framework that enables rapid few-shot adaptation of frozen medical FMs using only a handful of labeled examples. Unlike prior approaches that modify backbone weights or introduce heavy attention-based adapters, PIKACHU performs all task adaptation directly in the FM feature space through in-context prototypical reasoning. Given a small support set, the framework constructs class prototypes by averaging normalized embeddings from a frozen VLFM image encoder and performs prediction on query images using temperature-scaled cosine similarity. Only a single temperature parameter is learned. We evaluate PIKACHU on three heterogeneous medical imaging datasets, covering dermatological images (ISIC), Optical Coherence Tomography (OCT), and Diabetic Retinopathy (DR), using established vision models (SigLIP, PubMedCLIP, DinoV2, and ViT) as backbones. The proposed in-context learning (ICL) strategy consistently outperforms the baseline (zero-shot) approaches across all datasets and architectures, achieving substantial improvements in both accuracy and AUC. Notably, with PubMedCLIP as the backbone, PIKACHU achieves 0.69/0.76 (Acc./AUC) on the ISIC dataset, 0.72/0.78 on OCT, and 0.79/0.88 on DR, demonstrating robust generalization across diverse clinical imaging modalities. These results highlight the promise of feature-space in-context learning as an efficient and deployable paradigm for test-time adaptation of foundation models, without the need for extensive retraining.
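The prototype step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the re-normalization of each prototype, the softmax over similarities, and the fixed temperature value are all assumptions made for illustration (the abstract states that the temperature is learned, which is not shown here). Embeddings are assumed to come from some frozen image encoder and are passed in as plain lists of floats.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def build_prototypes(support_embeddings, support_labels):
    """Average the L2-normalized support embeddings of each class."""
    sums, counts = {}, {}
    for emb, label in zip(support_embeddings, support_labels):
        unit = l2_normalize(emb)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
            counts[label] = 0
        sums[label] = [s + u for s, u in zip(sums[label], unit)]
        counts[label] += 1
    # Assumption: re-normalize each mean so the dot product below is a cosine.
    return {
        label: l2_normalize([s / counts[label] for s in sums[label]])
        for label in sums
    }


def predict_proba(query_embedding, prototypes, temperature=0.1):
    """Temperature-scaled cosine similarity to each prototype, then softmax.

    The softmax and the default temperature are illustrative assumptions.
    """
    query = l2_normalize(query_embedding)
    labels = sorted(prototypes)
    logits = [
        sum(q * p for q, p in zip(query, prototypes[label])) / temperature
        for label in labels
    ]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - max_logit) for z in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


# Toy 2-D embeddings standing in for frozen-encoder features.
support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["a", "a", "b", "b"]
prototypes = build_prototypes(support, labels)
probs = predict_proba([0.8, 0.2], prototypes)
```

In this sketch the backbone is never updated; only `temperature` would be tuned, which matches the abstract's description of a single learned parameter.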

2026-02-13

Medical Imaging with Deep Learning (poster)

proceedings.mlr.press

Building a General SimCLR Self-Supervised Foundation Model Across Neurological Diseases to Advance 3D Brain MRI Diagnoses

3D structural Magnetic Resonance Imaging (MRI) brain scans are commonly acquired in clinical settings to monitor a wide range of neurological conditions, including neurodegenerative disorders and stroke. While deep learning models have shown promising results analyzing 3D MRI across a number of brain imaging tasks, most are highly tailored for specific tasks with limited labeled data, and are not able to generalize across tasks and/or populations. The development of self-supervised learning (SSL) has enabled the creation of large medical foundation models that leverage diverse, unlabeled datasets ranging from healthy to diseased data, showing significant success in 2D medical imaging applications. However, even the very few foundation models for 3D brain MRI that have been developed remain limited in resolution, scope, or accessibility. In this work, we present a general, high-resolution SimCLR-based SSL foundation model for 3D brain structural MRI, pre-trained on 18,759 patients (44,958 scans) from 11 publicly available datasets spanning diverse neurological diseases. We compare our model to Masked Autoencoders (MAE), as well as two supervised baselines, on four diverse downstream prediction tasks in both in-distribution and out-of-distribution settings. Our fine-tuned SimCLR model outperforms all other models across all tasks. Notably, our model still achieves superior performance when fine-tuned using only 20% of labeled training samples for predicting Alzheimer's disease. We use publicly available code and data, and release our trained model at https://github.com/emilykaczmarek/3D-Neuro-SimCLR, contributing a broadly applicable and accessible foundation model for clinical brain MRI analysis.

2025-09-11

arXiv (preprint)

doi.org

arxiv.org

SSL-AD: Spatiotemporal Self-Supervised Learning for Generalizability and Adaptability Across Alzheimer's Prediction Tasks and Datasets

Alzheimer's disease is a progressive, neurodegenerative disorder that causes memory loss and cognitive decline. While there has been extensive research in applying deep learning models to Alzheimer's prediction tasks, these models remain limited by lack of available labeled data, poor generalization across datasets, and inflexibility to varying numbers of input scans and time intervals between scans. In this study, we adapt three state-of-the-art temporal self-supervised learning (SSL) approaches for 3D brain MRI analysis, and add novel extensions designed to handle variable-length inputs and learn robust spatial features. We aggregate four publicly available datasets comprising 3,161 patients for pre-training, and show the performance of our model across multiple Alzheimer's prediction tasks including diagnosis classification, conversion detection, and future conversion prediction. Importantly, our SSL model implemented with temporal order prediction and contrastive learning outperforms supervised learning on six out of seven downstream tasks. It demonstrates adaptability and generalizability across tasks and number of input images with varying time intervals, highlighting its capacity for robust performance across clinical applications. We release our code and model publicly at https://github.com/emilykaczmarek/SSL-AD.

2025-09-11

arXiv (preprint)

doi.org

arxiv.org

Conditional Diffusion Models are Medical Image Classifiers that Provide Explainability and Uncertainty for Free

Gian Mario Favero

2025-03-26

Medical Imaging with Deep Learning (oral presentation)