Tal Arbel

Biography

Tal Arbel is a full professor in the Department of Electrical and Computer Engineering at McGill University, where she directs the Probabilistic Vision Group and the Medical Imaging Lab in the Centre for Intelligent Machines.

She holds a Canada CIFAR AI Chair and is an associate member of Mila - Quebec Artificial Intelligence Institute and of the Goodman Cancer Research Centre. Professor Arbel's research focuses on developing probabilistic deep learning methods in computer vision and medical image analysis for a wide range of real-world applications, with a particular emphasis on neurological diseases.

She received the 2019 McGill Engineering Christophe Pierre Research Award and is a Fellow of the Canadian Academy of Engineering. She regularly serves on the organizing teams of major international conferences in computer vision and medical image analysis, including MICCAI (Medical Image Computing and Computer-Assisted Intervention Society), MIDL (Medical Imaging with Deep Learning), ICCV (International Conference on Computer Vision), and CVPR (Conference on Computer Vision and Pattern Recognition). She is editor-in-chief and co-founder of the journal Machine Learning for Biomedical Imaging (MELBA).

Current Students

Karl Bridi

Research Intern - McGill

PhD - McGill

Research Intern - McGill

Elizabeth Laura Janes

Research Master's - McGill

Emily Kaczmarek

PhD - McGill

Amar Kumar

Collaborating Alumni - McGill

Toky Raharison Ralambomihanta

Yik Yu Ng

Research Intern - McGill

Undergraduate - McGill

Ryan Rezai

Research Master's - McGill

Rachel Ruddy

Research Intern - McGill

Parham Saremi

Research Master's - McGill

Xing Shen

Research Master's - McGill

Minh To

Collaborating Researcher - UBC

PRISM: An Explainable Generative AI Model for Medical Imaging

Blog Posts

Image of an Xray and the DDIM process to generate counterfactual version of Xrays

July 1, 2025

by

Amar Kumar

Anita Kriz

Mohammed Havaei

Tal Arbel

Publications

Rethinking Generalization: The Impact of Annotation Style on Medical Image Segmentation

Jillian Cardinell

Justin Szeto

Raghav Mehta

Jean-Pierre R. Falet

Douglas L. Arnold

Sotirios A. Tsaftaris

Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, where unreliable predictions can have real-world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the "ground-truth" label annotations. This is particularly important in the context of medical image segmentation of pathological structures (e.g. lesions), where the annotation process is much more subjective, and affected by a number of underlying factors, including the annotation protocol, rater education/experience, and clinical aims, among others. In this paper, we show that modeling annotation biases, rather than ignoring them, poses a promising way of accounting for differences in annotation style across datasets. To this end, we propose a generalized conditioning framework to (1) learn and account for different annotation styles across multiple datasets using a single model, (2) identify similar annotation styles across different datasets in order to permit their effective aggregation, and (3) fine-tune a fully trained model to a new annotation style with just a few samples. Next, we present an image-conditioning approach to model annotation styles that correlate with specific image features, potentially enabling detection biases to be more easily identified.

2022-10-30

arXiv (preprint)

QU-BraTS: MICCAI BraTS 2020 Challenge on Quantifying Uncertainty in Brain Tumor Segmentation - Analysis of Ranking Scores and Benchmarking Results

Raghav Mehta

Angelos Filos

Ujjwal Baid

Chiharu Sako

Richard McKinley

Michael Rebsamen

Katrin Datwyler

Raphaël Meier

Piotr Radojewski

Gowtham Krishnan Murugesan

Sahil Nalawade

Chandan Ganesh

Ben Wagner

Fang Yu

Baowei Fei

Ananth J. Madhuranthakam

Joseph A. Maldjian

Laura Daza

Catalina Gómez

Pablo Arbeláez

Chengliang Dai

Shuo Wang

Hadrien Reynaud

Yuan-han Mo

Elsa D. Angelini

Yike Guo

Wenjia Bai

Subhashis Banerjee

Lin-min Pei

Murat Ak

Sarahi Rosas-González

Ilyess Zemmoura

Clovis Tauber

Minh H. Vu

Tufve Nyholm

Tommy Löfstedt

Laura Mora Ballestar

Verónica Vilaplana

Hugh McHugh

Gonzalo D. Maso Talou

Alan Wang

Jay Patel

Ken Chang

Katharina Hoebel

Mishka Gidwani

Nishanth Arun

Sharut Gupta

Mehak Aggarwal

Praveer Singh

Elizabeth R. Gerstner

Jayashree Kalpathy-Cramer

Nicolas Boutry

Alexis Huard

Lasitha Vidyaratne

Md Monibor Rahman

Khan M. Iftekharuddin

Joseph Chazalon

Élodie Puybareau

Guillaume Tochon

Jun Ma

Mariano Cabezas

Xavier Lladó

Arnau Oliver

Liliana Patricia Marlés Valencia

Sergi Valverde

Mehdi Amian

Mohammadreza Soltaninejad

Andriy Myronenko

Ali Hatamizadeh

Xue Feng

Fang Yu

Nicholas Tustison

Craig H. Meyer

Nisarg A. Shah

Sanjay N. Talbar

Marc‐André Weber

Abhishek Mahajan

Andras Jakab

Roland Wiest

Hassan M. Fathallah‐Shaykh

Arash Nazeri

Mikhail Milchenko

Daniel C. Marcus

Aikaterini Kotrotsou

Rivka R. Colen

John Freymann

Justin Kirby

Christos Davatzikos

Bjoern Menze

Spyridon Bakas

Yarin Gal

Deep learning (DL) models have provided state-of-the-art performance in various medical imaging benchmarking challenges, including the Brain Tumor Segmentation (BraTS) challenges. However, the task of focal pathology multi-compartment segmentation (e.g., tumor and lesion sub-regions) is particularly challenging, and potential errors hinder translating DL models into clinical workflows. Quantifying the reliability of DL model predictions in the form of uncertainties could enable clinical review of the most uncertain regions, thereby building trust and paving the way toward clinical translation. Several uncertainty estimation methods have recently been introduced for DL medical image segmentation tasks. Developing scores to evaluate and compare the performance of uncertainty measures will assist the end-user in making more informed decisions. In this study, we explore and evaluate a score developed during the BraTS 2019 and BraTS 2020 task on uncertainty quantification (QU-BraTS) and designed to assess and rank uncertainty estimates for brain tumor multi-compartment segmentation. This score (1) rewards uncertainty estimates that produce high confidence in correct assertions and those that assign low confidence levels at incorrect assertions, and (2) penalizes uncertainty measures that lead to a higher percentage of under-confident correct assertions. We further benchmark the segmentation uncertainties generated by 14 independent participating teams of QU-BraTS 2020, all of which also participated in the main BraTS segmentation task. Overall, our findings confirm the importance and complementary value that uncertainty estimates provide to segmentation algorithms, highlighting the need for uncertainty quantification in medical image analyses. Finally, in favor of transparency and reproducibility, our evaluation code is made publicly available at https://github.com/RagMeh11/QU-BraTS.

2022-08-25

Machine Learning for Biomedical Imaging (published)
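As a rough illustration of the scoring idea in the abstract above, the sketch below filters out voxels whose uncertainty exceeds a threshold and then reports the Dice score on the remaining voxels together with the fraction of true positives that were filtered out. This is a simplified toy under stated assumptions, not the official evaluation code (which the authors publish at the URL above): the function names, the 0 to 100 uncertainty scale, and the toy voxel values are all hypothetical.

```python
def dice(pred, truth):
    """Dice overlap between two binary label lists."""
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 1.0 if total == 0 else 2.0 * intersection / total


def evaluate_at_threshold(pred, truth, uncertainty, tau):
    """Drop voxels whose uncertainty exceeds tau, then score what remains."""
    keep = [u <= tau for u in uncertainty]
    kept_pred = [p for p, k in zip(pred, keep) if k]
    kept_truth = [t for t, k in zip(truth, keep) if k]
    # True positives before and after filtering
    tp_all = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tp_kept = sum(1 for p, t in zip(kept_pred, kept_truth) if p == 1 and t == 1)
    # Share of correct positive voxels lost to filtering (under-confidence)
    filtered_tp_ratio = 0.0 if tp_all == 0 else (tp_all - tp_kept) / tp_all
    return dice(kept_pred, kept_truth), filtered_tp_ratio


# Toy voxels (hypothetical): the single wrong prediction has high uncertainty
pred = [1, 1, 0, 1, 0, 0]
truth = [1, 1, 0, 0, 0, 0]
uncertainty = [5, 10, 5, 90, 5, 10]

unfiltered = evaluate_at_threshold(pred, truth, uncertainty, tau=100)
filtered = evaluate_at_threshold(pred, truth, uncertainty, tau=50)
print(unfiltered, filtered)
```

In this toy case, filtering at the lower threshold removes only the incorrect voxel, so Dice rises while no true positives are lost; an uncertainty map that was high on correct voxels would instead increase the filtered true-positive ratio.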

Heatmap Regression for Lesion Detection using Pointwise Annotations

Chelsea Myers-Colet

Julien Schroeter

Douglas Arnold

In many clinical contexts, detecting all lesions is imperative for evaluating disease activity. Standard approaches pose lesion detection as a segmentation problem despite the time-consuming nature of acquiring segmentation labels. In this paper, we present a lesion detection method which relies only on point labels. Our model, which is trained via heatmap regression, can detect a variable number of lesions in a probabilistic manner. In fact, our proposed post-processing method offers a reliable way of directly estimating the lesion existence uncertainty. Experimental results on Gad lesion detection show our point-based method performs competitively compared to training on expensive segmentation labels. Finally, our detection model provides a suitable pre-training for segmentation. When fine-tuning on only 17 segmentation samples, we achieve comparable performance to training with the full dataset.

2022-08-10

arXiv (preprint)
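To make the heatmap-regression idea in the abstract above more concrete, the following minimal sketch builds a regression target from point labels by placing a Gaussian bump at each annotated point, then recovers detections as thresholded local maxima. The Gaussian kernel, the `sigma` value, the threshold, and all function names are illustrative assumptions; the paper's actual target construction and post-processing may differ.

```python
import math


def make_heatmap(points, height, width, sigma=1.5):
    """Build a 2D target: max over Gaussian bumps centred on each point label."""
    heatmap = [[0.0] * width for _ in range(height)]
    for (py, px) in points:
        for y in range(height):
            for x in range(width):
                d2 = (y - py) ** 2 + (x - px) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                heatmap[y][x] = max(heatmap[y][x], value)
    return heatmap


def detect_peaks(heatmap, threshold=0.5):
    """Return (row, col) of strict local maxima (8-neighbourhood) above threshold."""
    height, width = len(heatmap), len(heatmap[0])
    peaks = []
    for y in range(height):
        for x in range(width):
            v = heatmap[y][x]
            if v < threshold:
                continue
            neighbours = [
                heatmap[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(v > n for n in neighbours):
                peaks.append((y, x))
    return peaks


# Two hypothetical point annotations on a 16x16 slice
points = [(3, 4), (10, 12)]
target = make_heatmap(points, height=16, width=16)
detections = detect_peaks(target)
print(detections)
```

In a trained model the peak detector would run on the predicted heatmap rather than on the target, and the peak height could serve as a rough confidence that a lesion exists at that location.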

Counterfactual Image Synthesis for Discovery of Personalized Predictive Image Markers

Amar Kumar

Anjun Hu

Jean-Pierre R. Falet

Douglas Arnold

Sotirios A. Tsaftaris

2022-08-02

arXiv (preprint)

Information Gain Sampling for Active Learning in Medical Image Classification

Raghav Mehta

Changjian Shui

2022-07-31

arXiv (preprint)

GP.2 Deep learning prediction of response to disease modifying therapy in primary progressive multiple sclerosis

JR Falet

Joshua D. Durso-Finley

Julien Schroeter

Francesca Bovis

M Sormani

D Precup

DL Arnold

Background: Only one disease modifying therapy (DMT), ocrelizumab, was found to slow disability progression in primary progressive multiple sclerosis (PPMS). Modeling the conditional average treatment effect (CATE) using deep learning could identify individuals more responsive to DMTs, allowing for predictive enrichment to increase the power of future clinical trials. Methods: Baseline clinical and MRI data were acquired as part of three placebo-controlled randomized clinical trials: ORATORIO (ocrelizumab), OLYMPUS (rituximab) and ARPEGGIO (laquinimod). Data from ORATORIO and OLYMPUS were separated into a training (70%) and testing (30%) set, while ARPEGGIO served as additional validation. An ensemble of multitask multilayer perceptrons was trained to predict the rate of disability progression on both treatment and placebo to estimate CATE. Results: The model could separate individuals based on their predicted treatment effect. The top 25% of individuals predicted to respond most have a larger effect size (HR 0.442, p=0.0497) than the entire group (HR 0.787, p=0.292). The model could also identify responders to laquinimod. A simulated study where the 50% most responsive individuals are randomized would require 6 times fewer participants to detect a significant effect. Conclusions: Individuals with PPMS who respond favourably to DMTs can be identified using deep learning based on their baseline clinical and imaging characteristics.

2022-06-23

Canadian Journal of Neurological Sciences / Journal Canadien des Sciences Neurologiques (published)

Metrics Reloaded - A new recommendation framework for biomedical image analysis validation

Annika Reinke

Lena Maier-Hein

Evangelia Christodoulou

Ben Glocker

Patrick Scholz

Fabian Isensee

Jens Kleesiek

Michal Kozubek

Mauricio Reyes

Michael Alexander Riegler

Manuel Wiesenfarth

Michael Baumgartner

Matthias Eisenmann

Doreen Heckmann-Nötzel

Ali Emre Kavur

Tim Rädsch

Minu D. Tizabi

Laura Acion

Michela Antonelli

Tal Arbel

Spyridon Bakas

Peter Bankhead

Arriel Benis

M. Jorge Cardoso

Veronika Cheplygina

Beth A Cimini

Gary S. Collins

Keyvan Farahani

Bram van Ginneken

Fred A Hamprecht

Daniel A. Hashimoto

Michael M. Hoffman

Merel Huisman

Pierre Jannin

Charles Kahn

Alexandros Karargyris

Alan Karthikesalingam

Hannes Kenngott

Annette Kopp-Schneider

Anna Kreshuk

Tahsin Kurc

Bennett Landman

Geert Litjens

Amin Madani

Klaus Maier-Hein

Anne Martel

Peter Mattson

Erik Meijering

Bjoern Menze

David Moher

Karel G.M. Moons

Henning Müller

Felix Nickel

Jens Petersen

Nasir Rajpoot

Nicola Rieke

Julio Saez-Rodriguez

Clara I. Sánchez

Shravya Shetty

Maarten van Smeden

Carole H. Sudre

Ronald M. Summers

Abdel A. Taha

Sotirios A. Tsaftaris

Ben Van Calster

Gael Varoquaux

Paul F Jaeger

Meaningful performance assessment of biomedical image analysis algorithms depends on objective and appropriate performance metrics. There are major shortcomings in the current state of the art. Yet, so far limited attention has been paid to practical pitfalls associated with using particular metrics for image analysis tasks. Therefore, a number of international initiatives have collaborated to offer researchers guidance and tools for selecting performance metrics in a problem-aware manner. In our proposed framework, the characteristics of the given biomedical problem are first captured in a problem fingerprint, which identifies properties related to domain interests, the target structure(s), the input datasets, and algorithm output. A problem category-specific mapping is applied in the second step to match fingerprints to metrics that reflect domain requirements. Based on input from experts from more than 60 institutions worldwide, we believe our metric recommendation framework to be useful to the MIDL community and to enhance the quality of biomedical image analysis algorithm validation.

2022-05-08

MIDL.io/2022/Conference/Short (published)

openreview.net

Deep Learning Prediction of Response to Disease Modifying Therapy in Primary Progressive Multiple Sclerosis (P1-1.Virtual)

Jean-Pierre René Falet

Joshua Durso-Finley

Julien Schroeter

Francesca Bovis

Maria-Pia Sormani

Doina Precup

Douglas Arnold

2022-05-02

Neurology (unknown)

On Learning Fairness and Accuracy on Multiple Subgroups

Changjian Shui

Gezheng Xu

Qi Chen

Jiaqi Li

Charles Ling

Boyu Wang

Christian Gagné

We propose an analysis in fair learning that preserves the utility of the data while reducing prediction disparities under the criteria of group sufficiency. We focus on the scenario where the data contains multiple or even many subgroups, each with a limited number of samples. As a result, we present a principled method for learning a fair predictor for all subgroups via formulating it as a bilevel objective. Specifically, the subgroup specific predictors are learned in the lower-level through a small amount of data and the fair predictor. In the upper-level, the fair predictor is updated to be close to all subgroup specific predictors. We further prove that such a bilevel objective can effectively control the group sufficiency and generalization error. We evaluate the proposed framework on real-world datasets. Empirical evidence suggests consistently improved fair predictions, as well as accuracy comparable to the baselines.

2021-12-31

Advances in Neural Information Processing Systems 35 (NeurIPS 2022) (published)

openreview.net

Estimating individual treatment effect on disability progression in multiple sclerosis using deep learning

Jean-Pierre R. Falet

Joshua Durso-Finley

Julien Schroeter

Francesca Bovis

Maria-Pia Sormani

Doina Precup

Douglas Lorne Arnold

Disability progression in multiple sclerosis remains resistant to treatment. The absence of a suitable biomarker to allow for phase 2 clinical trials presents a high barrier for drug development. We propose to enable short proof-of-concept trials by increasing statistical power using a deep-learning predictive enrichment strategy. Specifically, a multi-headed multilayer perceptron is used to estimate the conditional average treatment effect (CATE) using baseline clinical and imaging features, and patients predicted to be most responsive are preferentially randomized into a trial. Leveraging data from six randomized clinical trials (n = 3,830), we first pre-trained the model on the subset of relapsing-remitting MS patients (n = 2,520), then fine-tuned it on a subset of primary progressive MS (PPMS) patients (n = 695). In a separate held-out test set of PPMS patients randomized to anti-CD20 antibodies or placebo (n = 297), the average treatment effect was larger for the 50% (HR, 0.492; 95% CI, 0.266-0.912; p = 0.0218) and 30% (HR, 0.361; 95% CI, 0.165-0.79; p = 0.008) predicted to be most responsive, compared to 0.743 (95% CI, 0.482-1.15; p = 0.179) for the entire group. The same model could also identify responders to laquinimod in another held-out test set of PPMS patients (n = 318). Finally, we show that using this model for predictive enrichment results in important increases in power.

2021-10-31

medRxiv (preprint)
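The predictive-enrichment step described in the abstract above can be sketched as follows: given each patient's predicted outcome under treatment and under placebo (standing in for the outputs of two prediction heads), the CATE estimate is their difference, and the patients with the largest predicted benefit are selected. Everything here is an assumption for illustration: the function names, the sign convention (outcome is a progression rate, so a more negative CATE means more benefit), and the numbers, which are made up and not trial data.

```python
def estimate_cate(pred_treated, pred_placebo):
    """CATE estimate per patient: predicted outcome on treatment minus on placebo."""
    return [t - p for t, p in zip(pred_treated, pred_placebo)]


def select_most_responsive(cate, fraction):
    """Indices of the `fraction` of patients with the largest predicted benefit
    (most negative CATE, since lower progression is better)."""
    n_keep = max(1, round(fraction * len(cate)))
    ranked = sorted(range(len(cate)), key=lambda i: cate[i])
    return sorted(ranked[:n_keep])


# Hypothetical outputs of the two prediction heads (not real data)
pred_treated = [0.30, 0.10, 0.25, 0.40, 0.05, 0.20]
pred_placebo = [0.35, 0.30, 0.25, 0.38, 0.30, 0.22]

cate = estimate_cate(pred_treated, pred_placebo)
enriched = select_most_responsive(cate, fraction=0.5)
print(enriched)
```

A trial enriched this way would preferentially randomize the selected patients, which is the mechanism the abstract credits for the gain in statistical power.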

Cohort Bias Adaptation in Aggregated Datasets for Lesion Segmentation

Jillian Cardinell

Justin Szeto

Raghav Mehta

Sotirios Tsaftaris

Douglas L. Arnold

Many automatic machine learning models developed for focal pathology (e.g. lesions, tumours) detection and segmentation perform well, but do not generalize as well to new patient cohorts, impeding their widespread adoption into real clinical contexts. One strategy to create a more diverse, generalizable training set is to naively pool datasets from different cohorts. Surprisingly, training on this big data does not necessarily increase, and may even reduce, overall performance and model generalizability, due to the existence of cohort biases that affect label distributions. In this paper, we propose a generalized affine conditioning framework to learn and account for cohort biases across multi-source datasets, which we call Source-Conditioned Instance Normalization (SCIN). Through extensive experimentation on three different, large scale, multi-scanner, multi-centre Multiple Sclerosis (MS) clinical trial MRI datasets, we show that our cohort bias adaptation method (1) improves performance of the network on pooled datasets relative to naively pooling datasets and (2) can quickly adapt to a new cohort by fine-tuning the instance normalization parameters, thus learning the new cohort bias with only 10 labelled samples.

2021-09-20

Domain Adaptation and Representation Transfer, and Affordable Healthcare and AI for Resource Diverse Global Health (published)
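The affine conditioning idea named in the abstract above can be illustrated with a small sketch: each channel is instance-normalized, then scaled and shifted with parameters chosen by the sample's source cohort. This is a plain-Python toy under stated assumptions, not the authors' implementation: the class name, the dictionary layout, the cohort names, and the parameter values are hypothetical.

```python
import math


def instance_norm(channel, eps=1e-5):
    """Normalise one channel of one sample to zero mean and unit variance."""
    n = len(channel)
    mean = sum(channel) / n
    var = sum((v - mean) ** 2 for v in channel) / n
    return [(v - mean) / math.sqrt(var + eps) for v in channel]


class SourceConditionedInstanceNorm:
    """One learnable (scale, shift) pair per channel for every source cohort."""

    def __init__(self, num_channels, sources):
        # Identity affine parameters at initialisation
        self.params = {
            s: {"gamma": [1.0] * num_channels, "beta": [0.0] * num_channels}
            for s in sources
        }

    def __call__(self, features, source):
        """features: list of channels, each a flat list of activations."""
        p = self.params[source]
        return [
            [p["gamma"][c] * v + p["beta"][c] for v in instance_norm(channel)]
            for c, channel in enumerate(features)
        ]


layer = SourceConditionedInstanceNorm(num_channels=2, sources=["trial_a", "trial_b"])
# Hypothetical learned values for one cohort
layer.params["trial_b"]["gamma"] = [2.0, 0.5]
layer.params["trial_b"]["beta"] = [0.1, -0.1]

features = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 20.0, 20.0]]
out_a = layer(features, "trial_a")
out_b = layer(features, "trial_b")
print(out_a[0], out_b[0])
```

Under this layout, adapting to a new cohort amounts to adding one more entry to `params` and fitting only those scale and shift values while the rest of the network stays frozen, which is consistent with the few-sample fine-tuning the abstract reports.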

Nazanin Mohammadi Sepahvand

HAD-Net: A Hierarchical Adversarial Knowledge Distillation Network for Improved Enhanced Tumour Segmentation Without Post-Contrast Images

Saverio Vadacchino

Raghav Mehta

James J Clark

Segmentation of enhancing tumours or lesions from MRI is important for detecting new disease activity in many clinical contexts. However, accurate segmentation requires the inclusion of medical images (e.g., T1 post contrast MRI) acquired after injecting patients with a contrast agent (e.g., Gadolinium), a process no longer thought to be safe. Although a number of modality-agnostic segmentation networks have been developed over the past few years, they have been met with limited success in the context of enhancing pathology segmentation. In this work, we present HAD-Net, a novel offline adversarial knowledge distillation (KD) technique, whereby a pre-trained teacher segmentation network, with access to all MRI sequences, teaches a student network, via hierarchical adversarial training, to better overcome the large domain shift presented when crucial images are absent during inference. In particular, we apply HAD-Net to the challenging task of enhancing tumour segmentation when access to post-contrast imaging is not available. The proposed network is trained and tested on the BraTS 2019 brain tumour segmentation challenge dataset, where it achieves performance improvements in the ranges of 16% - 26% over (a) recent modality-agnostic segmentation methods (U-HeMIS, U-HVED), (b) KD-Net adapted to this problem, (c) the pre-trained student network and (d) a non-hierarchical version of the network (AD-Net), in terms of Dice scores for enhancing tumour (ET). The network also shows improvements in tumour core (TC) Dice scores. Finally, the network outperforms both the baseline student network and AD-Net in terms of uncertainty quantification for enhancing tumour segmentation based on the BraTS 2019 uncertainty challenge metrics. Our code is publicly available at: https://github.com/SaverioVad/HAD_Net

2021-08-24

Proceedings of the Fourth Conference on Medical Imaging with Deep Learning (published)