Tal Arbel

Biography

Tal Arbel is a professor in the Department of Electrical and Computer Engineering at McGill University, where she is the director of the Probabilistic Vision Group and Medical Imaging Lab in the Centre for Intelligent Machines.

She is also a Canada CIFAR AI Chair, an associate academic member of Mila – Quebec Artificial Intelligence Institute and an associate member of the Goodman Cancer Research Centre.

Arbel’s research focuses on the development of probabilistic deep learning methods in computer vision and medical image analysis for a wide range of real-world applications, with an emphasis on neurological diseases.

She is a recipient of the 2019 McGill Engineering Christophe Pierre Research Award and a Fellow of the Canadian Academy of Engineering. She regularly serves on the organizing teams of major international conferences in computer vision and medical image analysis (e.g. MICCAI, MIDL, ICCV, CVPR). She is currently the Editor-in-Chief and co-founder of the arXiv overlay journal Machine Learning for Biomedical Imaging (MELBA).

Current Students

Karl Bridi

Research Intern - McGill University

Cornelius Crijnen

PhD - McGill University

Oscar Cruzhernandez

Research Intern - McGill University

Charbel El Feghali

Research Intern - McGill University

Carlotta Hoelzle

Research Intern - McGill University

Elizabeth Laura Janes

Master's Research - McGill University

Emily Kaczmarek

PhD - McGill University

Yusong Li

Collaborating researcher - McGill University

Toky Raharison Ralambomihanta

Research Intern - McGill University

Yik Yu Ng

Research Intern - McGill University

Ryan Rezai

Master's Research - McGill University

Rachel Ruddy

Research Intern - McGill University

Minh To

Collaborating researcher - UBC

Blog Posts

PRISM: An Explainable Generative AI Model for Medical Imaging

July 1, 2025

Amar Kumar

Anita Kriz

Mohammed Havaei

Tal Arbel

Publications

PRISM: High-Resolution & Precise Counterfactual Medical Image Generation using Language-guided Stable Diffusion

Developing reliable and generalizable deep learning systems for medical imaging faces significant obstacles due to spurious correlations, data imbalances, and limited text annotations in datasets. Addressing these challenges requires architectures robust to the unique complexities posed by medical imaging data. The rapid advancements in vision-language foundation models within the natural image domain prompt the question of how they can be adapted for medical imaging tasks. In this work, we present PRISM, a framework that leverages foundation models to generate high-resolution, language-guided medical image counterfactuals using Stable Diffusion. Our approach demonstrates unprecedented precision in selectively modifying spurious correlations (the medical devices) and disease features, enabling the removal and addition of specific attributes while preserving other image characteristics. Through extensive evaluation, we show how PRISM advances counterfactual generation and enables the development of more robust downstream classifiers for clinically deployable solutions. To facilitate broader adoption and research, we make our code publicly available at https://github.com/Amarkr1/PRISM.

2026-05-17

Medical Imaging with Deep Learning (published)

proceedings.mlr.press

Towards Brain MRI Foundation Models for the Clinic: Findings from the FOMO25 Challenge

Asbjørn Munk

Stefano Cerri

Vardan Nersesjan

Christian Hedeager Krag

Jakob Ambsdorf

Pedro García

Julia Machnio

Peirong Liu

Suhyun Ahn

Nasrin Akbari

Yasmina Al Khalil

Kimberly Amador

Sina Amirrajab

Meritxell Bach Cuadra

Ujjwal Baid

Bhakti Baheti

Jaume Banús

Kamil Barbierik

Christoph Brune

步岩松

Baptiste Callard

Yuhan Chen

Cornelius Crijnen

Corentin Dancette

Peter Drotár

Prasad Dutande

Nils D. Forkert

Saurabh K. Garg

Jakub Gazda

Matej Gazda

Benoît Gérin

Partha Ghosh

Weikang Gong

Pedro M. Gordaliza

Sam Hashemi

Tobias Heimann

Fucang Jia

Jiexin Jiang

Emily Kaczmarek

Chris Kang

Seung Kwan Kang

Mohammad Khazaei

Julien Khlaut

Petros Koutsouvelis

Jae Sung Lee

Yuchong Li

Mengye Lyu

Mingchen Ma

Anant Madabhushi

Klaus H. Maier-Hein

Pierre Manceron

Andrés Martínez Mora

Moona Mazher

Felix Meister

Nataliia Molchanova

Steven A. Niederer

Leonard Nürnberg

Jinah Park

Abdul Qayyum

Jonas Richiardi

Antoine Saporta

Branislav Setlak

Ning Shen

Justin Szeto

Constantin Ulrich

Puru Vaish

Vibujithan Vigneshwaran

Leroy Volmer

Zihao Wang

Siqi Wei

Anthony Winder

Jelmer M. Wolterink

Maxence Wynen

Chang YANG

Si Young Yie

Mostafa Mehdipour Ghazi

Akshay Pai

Espen Jimenez‐Solem

Sebastian Nørgaard Llambias

Mikael Boesen

Michael Eriksen Benros

Juan Eugenio Iglesias

Mads Nielsen

Clinical deployment of automated brain MRI analysis faces a fundamental challenge: clinical data is heterogeneous and noisy, and high-quality labels are prohibitively costly to obtain. Self-supervised learning (SSL) can address this by leveraging the vast amounts of unlabeled data produced in clinical workflows to train robust foundation models that adapt out-of-domain with minimal supervision. However, the development of foundation models for brain MRI has been limited by small pretraining datasets and in-domain benchmarking focused on high-quality, research-grade data. To address this gap, we organized the FOMO25 challenge as a satellite event at MICCAI 2025. FOMO25 provided participants with a large pretraining dataset, FOMO60K, and evaluated models on data sourced directly from clinical workflows in few-shot and out-of-domain settings. Tasks covered infarct classification, meningioma segmentation, and brain age regression, and considered both models trained on FOMO60K (method track) and any data (open track). Nineteen foundation models from sixteen teams were evaluated using a standardized containerized pipeline. Results show that (a) self-supervised pretraining improves generalization on clinical data under domain shift, with the strongest models trained out-of-domain surpassing supervised baselines trained in-domain; (b) no single pretraining objective benefits all tasks: MAE favors segmentation, while hybrid reconstruction-contrastive objectives favor classification; and (c) strong performance was achieved by small pretrained models, while scaling model size and training duration did not yield reliable benefits.

2026-04-12

arXiv (preprint)

PIKACHU: Prototypical In-context Knowledge Adaptation for Clinical Heterogeneous Usage

Medical imaging systems increasingly rely on large vision language foundation models (VLFMs) trained on diverse biomedical corpora, yet these models remain difficult to adapt to new clinical tasks without costly fine-tuning and large annotated datasets. We present PIKACHU (Prototypical In-Context Knowledge Adaptation for Clinical Heterogeneous Usage), a lightweight and generalizable framework that enables rapid few-shot adaptation of frozen medical FMs using only a handful of labeled examples. Unlike prior approaches that modify backbone weights or introduce heavy attention-based adapters, PIKACHU performs all task adaptation directly in the FM feature space through in-context prototypical reasoning. Given a small support set, the framework constructs class prototypes by averaging normalized embeddings from a frozen VLFM image encoder and performs prediction on query images using temperature-scaled cosine similarity. Only a single temperature parameter is learned. We evaluate PIKACHU across three heterogeneous medical imaging datasets - dermatological images (ISIC), Optical Coherence Tomography (OCT), and Diabetic Retinopathy (DR) - using established vision models (SigLIP, PubMedCLIP, DinoV2, and ViT) as backbones. The proposed in-context learning (ICL) strategy consistently outperforms the baseline (zero-shot) approaches across all datasets and architectures, achieving substantial improvements in both accuracy and AUC. Notably, with PubMedCLIP as the backbone, PIKACHU achieves 0.69/0.76 (Acc./AUC) on the ISIC dataset, 0.72/0.78 on OCT, and 0.79/0.88 on DR, demonstrating robust generalization across diverse clinical imaging modalities. These results highlight the promise of feature-space in-context learning as an efficient and deployable paradigm for test-time adaptation of foundation models, without the need for extensive retraining.

2026-02-13

Medical Imaging with Deep Learning (poster)

proceedings.mlr.press
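
The prototype-based prediction described in the PIKACHU abstract above (class prototypes from averaged normalized embeddings, then temperature-scaled cosine similarity) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the re-normalization of the prototypes, and the fixed temperature of 0.07 are assumptions made for illustration, and the embeddings are taken to be precomputed by some frozen image encoder. In the paper the temperature is the single learned parameter; here it is simply passed in.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit Euclidean length."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def build_prototypes(support_emb, support_labels, num_classes):
    """Average the normalized support embeddings of each class.

    support_emb: (n_support, dim) array from a frozen encoder (assumed given).
    support_labels: (n_support,) integer class ids in [0, num_classes).
    Assumes every class has at least one support example.
    """
    z = l2_normalize(support_emb)
    protos = np.stack(
        [z[support_labels == c].mean(axis=0) for c in range(num_classes)]
    )
    # Re-normalizing (an assumption of this sketch) makes the dot product
    # with a normalized query an exact cosine similarity.
    return l2_normalize(protos)


def predict_proba(query_emb, prototypes, temperature=0.07):
    """Softmax over cosine similarities divided by a temperature.

    The default temperature is a hypothetical placeholder value.
    """
    sims = l2_normalize(query_emb) @ prototypes.T  # (n_query, num_classes)
    logits = sims / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    # Toy 2-D "embeddings" standing in for encoder outputs.
    support = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
    labels = np.array([0, 0, 1, 1])
    queries = np.array([[1.0, 0.2], [0.1, 1.0]])

    prototypes = build_prototypes(support, labels, num_classes=2)
    print(predict_proba(queries, prototypes).argmax(axis=1))
```

Because the encoder stays frozen and only the temperature would be tuned, adapting to a new task in this sketch amounts to recomputing the prototypes from a new support set.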

Discovering Latent Graphs with GFlowNets for Diverse Conditional Image Generation

Bailey Trang

Parham Saremi

Alan Q. Wang

Fangrui Huang

Zahra TehraniNasab

Amar Kumar

Li Fei-Fei

Ehsan Adeli

Capturing diversity is crucial in conditional and prompt-based image generation, particularly when conditions contain uncertainty that can lead to multiple plausible outputs. To generate diverse images reflecting this diversity, traditional methods often modify random seeds, making it difficult to discern meaningful differences between samples, or diversify the input prompt, which is limited in verbally interpretable diversity. We propose Rainbow, a novel conditional image generation framework, applicable to any pretrained conditional generative model, that addresses inherent condition/prompt uncertainty and generates diverse plausible images. Rainbow is based on a simple yet effective idea: decomposing the input condition into diverse latent representations, each capturing an aspect of the uncertainty and generating a distinct image. First, we integrate a latent graph, parameterized by Generative Flow Networks (GFlowNets), into the prompt representation computation. Second, leveraging GFlowNets' advanced graph sampling capabilities to capture uncertainty and output diverse trajectories over the graph, we produce multiple trajectories that collectively represent the input condition, leading to diverse condition representations and corresponding output images. Evaluations on natural image and medical image datasets demonstrate Rainbow's improvement in both diversity and fidelity across image synthesis, image generation, and counterfactual generation tasks.

2025-12-02

Neural Information Processing Systems (poster)

openreview.net

Prompt4Trust: A Reinforcement Learning Prompt Augmentation Framework for Clinically-Aligned Confidence Calibration in Multimodal Large Language Models

Anita Kriz

Elizabeth Laura Janes

Xing Shen

2025-10-18

ICCVW @ IEEE/CVF International Conference on Computer Vision (published)

Imagining Alternatives: Towards High-Resolution 3D Counterfactual Medical Image Generation via Language Guidance

Mohamed Mohamed

Brennan Nichyporuk

Douglas Arnold

Vision-language models have demonstrated impressive capabilities in generating 2D images under various conditions; however, the impressive performance of these models in 2D is largely enabled by extensive, readily available pretrained foundation models. Critically, comparable pretrained foundation models do not exist for 3D, significantly limiting progress in this domain. As a result, the potential of vision-language models to produce high-resolution 3D counterfactual medical images conditioned solely on natural language descriptions remains completely unexplored. Addressing this gap would enable powerful clinical and research applications, such as personalized counterfactual explanations, simulation of disease progression scenarios, and enhanced medical training by visualizing hypothetical medical conditions in realistic detail. Our work takes a meaningful step toward addressing this challenge by introducing a framework capable of generating high-resolution 3D counterfactual medical images of synthesized patients guided by free-form language prompts. We adapt state-of-the-art 3D diffusion models with enhancements from Simple Diffusion and incorporate augmented conditioning to improve text alignment and image quality. To our knowledge, this represents the first demonstration of a language-guided native-3D diffusion model applied specifically to neurological imaging data, where faithful three-dimensional modeling is essential to represent the brain's three-dimensional structure. Through results on two distinct neurological MRI datasets, our framework successfully simulates varying counterfactual lesion loads in Multiple Sclerosis (MS), and cognitive states in Alzheimer's disease, generating high-quality images while preserving subject fidelity in synthetically generated medical images. Our results lay the groundwork for prompt-driven disease progression analysis within 3D medical imaging.

2025-10-11

Lecture Notes in Computer Science (published)

Exposing and Mitigating Calibration Biases and Demographic Unfairness in MLLM Few-Shot In-Context Learning for Medical Image Classification

Xing Shen

Justin Szeto

Mingyang Li

Hengguan Huang

2025-09-19

Lecture Notes in Computer Science (published)

Building a General SimCLR Self-Supervised Foundation Model Across Neurological Diseases to Advance 3D Brain MRI Diagnoses

3D structural Magnetic Resonance Imaging (MRI) brain scans are commonly acquired in clinical settings to monitor a wide range of neurological conditions, including neurodegenerative disorders and stroke. While deep learning models have shown promising results analyzing 3D MRI across a number of brain imaging tasks, most are highly tailored for specific tasks with limited labeled data, and are not able to generalize across tasks and/or populations. The development of self-supervised learning (SSL) has enabled the creation of large medical foundation models that leverage diverse, unlabeled datasets ranging from healthy to diseased data, showing significant success in 2D medical imaging applications. However, even the very few foundation models for 3D brain MRI that have been developed remain limited in resolution, scope, or accessibility. In this work, we present a general, high-resolution SimCLR-based SSL foundation model for 3D brain structural MRI, pre-trained on 18,759 patients (44,958 scans) from 11 publicly available datasets spanning diverse neurological diseases. We compare our model to Masked Autoencoders (MAE), as well as two supervised baselines, on four diverse downstream prediction tasks in both in-distribution and out-of-distribution settings. Our fine-tuned SimCLR model outperforms all other models across all tasks. Notably, our model still achieves superior performance when fine-tuned using only 20% of labeled training samples for predicting Alzheimer's disease. We use publicly available code and data, and release our trained model at https://github.com/emilykaczmarek/3D-Neuro-SimCLR, contributing a broadly applicable and accessible foundation model for clinical brain MRI analysis.

2025-09-11

arXiv (preprint)

SSL-AD: Spatiotemporal Self-Supervised Learning for Generalizability and Adaptability Across Alzheimer's Prediction Tasks and Datasets

Alzheimer's disease is a progressive, neurodegenerative disorder that causes memory loss and cognitive decline. While there has been extensive research in applying deep learning models to Alzheimer's prediction tasks, these models remain limited by lack of available labeled data, poor generalization across datasets, and inflexibility to varying numbers of input scans and time intervals between scans. In this study, we adapt three state-of-the-art temporal self-supervised learning (SSL) approaches for 3D brain MRI analysis, and add novel extensions designed to handle variable-length inputs and learn robust spatial features. We aggregate four publicly available datasets comprising 3,161 patients for pre-training, and show the performance of our model across multiple Alzheimer's prediction tasks including diagnosis classification, conversion detection, and future conversion prediction. Importantly, our SSL model implemented with temporal order prediction and contrastive learning outperforms supervised learning on six out of seven downstream tasks. It demonstrates adaptability and generalizability across tasks and number of input images with varying time intervals, highlighting its capacity for robust performance across clinical applications. We release our code and model publicly at https://github.com/emilykaczmarek/SSL-AD.

2025-09-11

arXiv (preprint)

Spatio-Temporal Conditional Diffusion Models for Forecasting Future Multiple Sclerosis Lesion Masks Conditioned on Treatments

Gian Mario Favero

Ge Ya Luo

Douglas Arnold

Christopher Pal

2025-08-08

arXiv (preprint)

AURA: A Multi-Modal Medical Agent for Understanding, Reasoning & Annotation

Nima Fathi

Amar Kumar

Recent advancements in Large Language Models (LLMs) have catalyzed a paradigm shift from static prediction systems to AI agents capable of reasoning, interacting with tools, and adapting to complex tasks. While LLM-based agentic systems have shown promise across many domains, their application to medical imaging remains in its infancy. In this work, we introduce AURA, the first visual linguistic explainability agent designed specifically for comprehensive analysis, explanation, and evaluation of medical images. By enabling dynamic interactions, contextual explanations, and hypothesis testing, AURA represents a significant advancement toward more transparent, adaptable, and clinically aligned AI systems. We highlight the promise of agentic AI in transforming medical image analysis from static predictions to interactive decision support. Leveraging Qwen-32B, an LLM-based architecture, AURA integrates a modular toolbox comprising: (i) a segmentation suite with phrase grounding, pathology segmentation, and anatomy segmentation to localize clinically meaningful regions; (ii) a counterfactual image-generation module that supports reasoning through image-level explanations; and (iii) a set of evaluation tools including pixel-wise difference-map analysis, classification, and advanced state-of-the-art components to assess diagnostic relevance and visual interpretability.

2025-07-21

arXiv (preprint)

Pixel Perfect MegaMed: A Megapixel-Scale Vision-Language Foundation Model for Generating High Resolution Medical Images

Zahra Tehrani Nasab

Hujun Ni

Amar Kumar

2025-07-16

arXiv (preprint)