
Tal Arbel

Core Academic Member
Canada CIFAR AI Chair
Full Professor, McGill University, Department of Electrical and Computer Engineering
Research Topics
Medical Machine Learning
Representation Learning
Deep Learning
Causality
Generative Models
Probabilistic Models
Computer Vision

Biography

Tal Arbel is a full professor in the Department of Electrical and Computer Engineering at McGill University, where she directs the Probabilistic Vision Group and the Medical Imaging Lab in the Centre for Intelligent Machines.

She holds a Canada CIFAR AI Chair and is an associate member of Mila - Quebec Artificial Intelligence Institute, as well as of the Goodman Cancer Research Centre. Professor Arbel's research focuses on developing probabilistic deep learning methods in computer vision and medical image analysis for a wide range of real-world applications, with a particular emphasis on neurological diseases.

She received the 2019 Christophe Pierre Research Award from McGill Engineering and is a Fellow of the Canadian Academy of Engineering. She regularly serves on the organizing teams of major international conferences in computer vision and medical image analysis, for example those of the Medical Image Computing and Computer-Assisted Intervention Society (MICCAI), Medical Imaging with Deep Learning (MIDL), the International Conference on Computer Vision (ICCV), and the Conference on Computer Vision and Pattern Recognition (CVPR). She is editor-in-chief and co-founder of the journal Machine Learning for Biomedical Imaging (MELBA).

Current Students

Research Intern - McGill
PhD - McGill
Research Intern - McGill
Research Master's - McGill
PhD - McGill
Alumni Collaborator - McGill
Research Intern - McGill
Undergraduate - McGill
Research Master's - McGill
Research Intern - McGill
Research Master's - McGill
Research Master's - McGill
Research Collaborator - UBC

Publications

Rethinking Generalization: The Impact of Annotation Style on Medical Image Segmentation
Raghav Mehta
Jean-Pierre R. Falet
Douglas L. Arnold
Sotirios A. Tsaftaris
Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, where unreliable predictions can have real-world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the "ground-truth" label annotations. This is particularly important in the context of medical image segmentation of pathological structures (e.g. lesions), where the annotation process is much more subjective, and affected by a number of underlying factors, including the annotation protocol, rater education/experience, and clinical aims, among others. In this paper, we show that modeling annotation biases, rather than ignoring them, poses a promising way of accounting for differences in annotation style across datasets. To this end, we propose a generalized conditioning framework to (1) learn and account for different annotation styles across multiple datasets using a single model, (2) identify similar annotation styles across different datasets in order to permit their effective aggregation, and (3) fine-tune a fully trained model to a new annotation style with just a few samples. Next, we present an image-conditioning approach to model annotation styles that correlate with specific image features, potentially enabling detection biases to be more easily identified.
QU-BraTS: MICCAI BraTS 2020 Challenge on Quantifying Uncertainty in Brain Tumor Segmentation - Analysis of Ranking Scores and Benchmarking Results
Raghav Mehta
Angelos Filos
Ujjwal Baid
Chiharu Sako
Richard McKinley
Michael Rebsamen
Katrin Datwyler
Raphaël Meier
Piotr Radojewski
Gowtham Krishnan Murugesan
Sahil Nalawade
Chandan Ganesh
Ben Wagner
Fang Yu
Baowei Fei
Ananth J. Madhuranthakam
Joseph A. Maldjian
Laura Daza
Catalina Gómez
Pablo Arbeláez
Chengliang Dai
Shuo Wang
Hadrien Reynaud
Yuan-han Mo
Elsa D. Angelini
Yike Guo
Wenjia Bai
Subhashis Banerjee
Lin-min Pei
Murat AK
Sarahi Rosas-González
Ilyess Zemmoura
Clovis Tauber
Minh H. Vu
Tufve Nyholm
Tommy Löfstedt
Laura Mora Ballestar
Verónica Vilaplana
Hugh McHugh
Gonzalo D. Maso Talou
Alan Wang
Jay Patel
Ken Chang
Katharina Hoebel
Mishka Gidwani
Nishanth Arun
Mehak Aggarwal
Praveer Singh
Elizabeth R. Gerstner
Jayashree Kalpathy-Cramer
Nicolas Boutry
Alexis Huard
Lasitha Vidyaratne
Md Monibor Rahman
Khan M. Iftekharuddin
Joseph Chazalon
Élodie Puybareau
Guillaume Tochon
Jun Ma
Mariano Cabezas
Xavier Lladó
Arnau Oliver
Liliana Patricia Marlés Valencia
Sergi Valverde
Mehdi Amian
Mohammadreza Soltaninejad
Andriy Myronenko
Ali Hatamizadeh
Xue Feng
Fang Yu
Nicholas Tustison
Craig H. Meyer
Nisarg A. Shah
Sanjay N. Talbar
Marc‐André Weber
Abhishek Mahajan
Andras Jakab
Roland Wiest
Hassan M. Fathallah‐Shaykh
Arash Nazeri
Mikhail Milchenko
Daniel C. Marcus
Aikaterini Kotrotsou
Rivka R. Colen
John Freymann
Justin Kirby
Christos Davatzikos
Bjoern Menze
Spyridon Bakas
Yarin Gal
Deep learning (DL) models have provided state-of-the-art performance in various medical imaging benchmarking challenges, including the Brain Tumor Segmentation (BraTS) challenges. However, the task of focal pathology multi-compartment segmentation (e.g., tumor and lesion sub-regions) is particularly challenging, and potential errors hinder translating DL models into clinical workflows. Quantifying the reliability of DL model predictions in the form of uncertainties could enable clinical review of the most uncertain regions, thereby building trust and paving the way toward clinical translation. Several uncertainty estimation methods have recently been introduced for DL medical image segmentation tasks. Developing scores to evaluate and compare the performance of uncertainty measures will assist the end-user in making more informed decisions. In this study, we explore and evaluate a score developed during the BraTS 2019 and BraTS 2020 task on uncertainty quantification (QU-BraTS) and designed to assess and rank uncertainty estimates for brain tumor multi-compartment segmentation. This score (1) rewards uncertainty estimates that produce high confidence in correct assertions and those that assign low confidence levels at incorrect assertions, and (2) penalizes uncertainty measures that lead to a higher percentage of under-confident correct assertions. We further benchmark the segmentation uncertainties generated by 14 independent participating teams of QU-BraTS 2020, all of which also participated in the main BraTS segmentation task. Overall, our findings confirm the importance and complementary value that uncertainty estimates provide to segmentation algorithms, highlighting the need for uncertainty quantification in medical image analyses. Finally, in favor of transparency and reproducibility, our evaluation code is made publicly available at https://github.com/RagMeh11/QU-BraTS.
Heatmap Regression for Lesion Detection using Pointwise Annotations
Julien Schroeter
Douglas Arnold
In many clinical contexts, detecting all lesions is imperative for evaluating disease activity. Standard approaches pose lesion detection as a segmentation problem despite the time-consuming nature of acquiring segmentation labels. In this paper, we present a lesion detection method which relies only on point labels. Our model, which is trained via heatmap regression, can detect a variable number of lesions in a probabilistic manner. In fact, our proposed post-processing method offers a reliable way of directly estimating the lesion existence uncertainty. Experimental results on Gad lesion detection show our point-based method performs competitively compared to training on expensive segmentation labels. Finally, our detection model provides a suitable pre-training for segmentation. When fine-tuning on only 17 segmentation samples, we achieve comparable performance to training with the full dataset.
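As a rough illustration of the heatmap regression idea in the abstract above, the sketch below turns point labels into a regression target by placing a Gaussian bump at each annotated lesion centre, then recovers detections as local maxima. The function names `make_heatmap` and `local_maxima`, the Gaussian kernel, and the `sigma` and `threshold` values are assumptions made for illustration only; the abstract does not specify how targets are built or how the post-processing works.

```python
import math


def make_heatmap(height, width, points, sigma=1.5):
    """Build a 2D heatmap target from (row, col) point annotations.

    Each point contributes a Gaussian bump; overlapping bumps are combined
    with max so that every annotated centre keeps a peak value of 1.0.
    NOTE: the Gaussian kernel and sigma are illustrative assumptions.
    """
    heatmap = [[0.0] * width for _ in range(height)]
    for pr, pc in points:
        for r in range(height):
            for c in range(width):
                d2 = (r - pr) ** 2 + (c - pc) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                if value > heatmap[r][c]:
                    heatmap[r][c] = value
    return heatmap


def local_maxima(heatmap, threshold=0.5):
    """Return (row, col) positions above threshold that exceed all 8 neighbours."""
    height, width = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(height):
        for c in range(width):
            value = heatmap[r][c]
            if value < threshold:
                continue
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c))
    return peaks


# Toy example: two hypothetical lesion centres on a 10 x 10 grid.
target = make_heatmap(10, 10, [(2, 2), (6, 7)])
detections = local_maxima(target)
```

In a trained model the heatmap would be predicted by the network rather than built from labels, and the number of peaks found determines how many lesions are reported, which is what lets a single output map represent a variable number of detections.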
Counterfactual Image Synthesis for Discovery of Personalized Predictive Image Markers
Anjun Hu
Jean-Pierre R. Falet
Douglas Arnold
Sotirios A. Tsaftaris
Information Gain Sampling for Active Learning in Medical Image Classification
GP.2 Deep learning prediction of response to disease modifying therapy in primary progressive multiple sclerosis
JR Falet
Joshua D. Durso-Finley
Julien Schroeter
Francesca Bovis
M Sormani
D Precup
DL Arnold
Background: Only one disease modifying therapy (DMT), ocrelizumab, was found to slow disability progression in primary progressive multiple sclerosis (PPMS). Modeling the conditional average treatment effect (CATE) using deep learning could identify individuals more responsive to DMTs, allowing for predictive enrichment to increase the power of future clinical trials. Methods: Baseline clinical and MRI data were acquired as part of three placebo-controlled randomized clinical trials: ORATORIO (ocrelizumab), OLYMPUS (rituximab) and ARPEGGIO (laquinimod). Data from ORATORIO and OLYMPUS were separated into a training (70%) and testing (30%) set, while ARPEGGIO served as additional validation. An ensemble of multitask multilayer perceptrons was trained to predict the rate of disability progression on both treatment and placebo to estimate CATE. Results: The model could separate individuals based on their predicted treatment effect. The top 25% of individuals predicted to respond most have a larger effect size (HR 0.442, p=0.0497) than the entire group (HR 0.787, p=0.292). The model could also identify responders to laquinimod. A simulated study where the 50% most responsive individuals are randomized would require 6 times fewer participants to detect a significant effect. Conclusions: Individuals with PPMS who respond favourably to DMTs can be identified using deep learning based on their baseline clinical and imaging characteristics.
Metrics Reloaded - A new recommendation framework for biomedical image analysis validation
Annika Reinke
Lena Maier-Hein
Evangelia Christodoulou
Ben Glocker
Patrick Scholz
Fabian Isensee
Jens Kleesiek
Michal Kozubek
Mauricio Reyes
Michael Alexander Riegler
Manuel Wiesenfarth
Michael Baumgartner
Matthias Eisenmann
Doreen Heckmann-Nötzel
Ali Emre Kavur
Tim Rädsch
Minu D. Tizabi
Laura Acion
Michela Antonelli
Spyridon Bakas
Peter Bankhead
Arriel Benis
M. Jorge Cardoso
Veronika Cheplygina
Beth A Cimini
Gary S. Collins
Keyvan Farahani
Bram van Ginneken
Fred A Hamprecht
Daniel A. Hashimoto
Michael M. Hoffman
Merel Huisman
Pierre Jannin
Charles Kahn
Alexandros Karargyris
Alan Karthikesalingam
Hannes Kenngott
Annette Kopp-Schneider
Anna Kreshuk
Tahsin Kurc
Bennett Landman
Geert Litjens
Amin Madani
Klaus Maier-Hein
Anne Martel
Peter Mattson
Erik Meijering
Bjoern Menze
David Moher
Karel G.M. Moons
Henning Müller
Felix Nickel
Jens Petersen
Nasir Rajpoot
Nicola Rieke
Julio Saez-Rodriguez
Clara I. Sánchez
Shravya Shetty
Maarten van Smeden
Carole H. Sudre
Ronald M. Summers
Abdel A. Taha
Sotirios A. Tsaftaris
Ben Van Calster
Paul F Jaeger
Meaningful performance assessment of biomedical image analysis algorithms depends on objective and appropriate performance metrics. There are major shortcomings in the current state of the art. Yet, so far limited attention has been paid to practical pitfalls associated with using particular metrics for image analysis tasks. Therefore, a number of international initiatives have collaborated to offer researchers guidance and tools for selecting performance metrics in a problem-aware manner. In our proposed framework, the characteristics of the given biomedical problem are first captured in a problem fingerprint, which identifies properties related to domain interests, the target structure(s), the input datasets, and algorithm output. A problem category-specific mapping is applied in the second step to match fingerprints to metrics that reflect domain requirements. Based on input from experts from more than 60 institutions worldwide, we believe our metric recommendation framework to be useful to the MIDL community and to enhance the quality of biomedical image analysis algorithm validation.
Deep Learning Prediction of Response to Disease Modifying Therapy in Primary Progressive Multiple Sclerosis (P1-1.Virtual)
Jean-Pierre René Falet
Julien Schroeter
Francesca Bovis
Maria-Pia Sormani
Douglas Arnold
On Learning Fairness and Accuracy on Multiple Subgroups
Gezheng Xu
Qi Chen
Jiaqi Li
Charles Ling
Boyu Wang
We propose an analysis in fair learning that preserves the utility of the data while reducing prediction disparities under the criteria of group sufficiency. We focus on the scenario where the data contains multiple or even many subgroups, each with a limited number of samples. As a result, we present a principled method for learning a fair predictor for all subgroups via formulating it as a bilevel objective. Specifically, the subgroup-specific predictors are learned in the lower level through a small amount of data and the fair predictor. In the upper level, the fair predictor is updated to be close to all subgroup-specific predictors. We further prove that such a bilevel objective can effectively control the group sufficiency and generalization error. We evaluate the proposed framework on real-world datasets. Empirical evidence suggests consistently improved fair predictions, as well as accuracy comparable to the baselines.
Estimating individual treatment effect on disability progression in multiple sclerosis using deep learning
Jean-Pierre R. Falet
Julien Schroeter
Francesca Bovis
Maria-Pia Sormani
Douglas Lorne Arnold
Disability progression in multiple sclerosis remains resistant to treatment. The absence of a suitable biomarker to allow for phase 2 clinical trials presents a high barrier for drug development. We propose to enable short proof-of-concept trials by increasing statistical power using a deep-learning predictive enrichment strategy. Specifically, a multi-headed multilayer perceptron is used to estimate the conditional average treatment effect (CATE) using baseline clinical and imaging features, and patients predicted to be most responsive are preferentially randomized into a trial. Leveraging data from six randomized clinical trials (n = 3,830), we first pre-trained the model on the subset of relapsing-remitting MS patients (n = 2,520), then fine-tuned it on a subset of primary progressive MS (PPMS) patients (n = 695). In a separate held-out test set of PPMS patients randomized to anti-CD20 antibodies or placebo (n = 297), the average treatment effect was larger for the 50% (HR, 0.492; 95% CI, 0.266-0.912; p = 0.0218) and 30% (HR, 0.361; 95% CI, 0.165-0.79; p = 0.008) predicted to be most responsive, compared to 0.743 (95% CI, 0.482-1.15; p = 0.179) for the entire group. The same model could also identify responders to laquinimod in another held-out test set of PPMS patients (n = 318). Finally, we show that using this model for predictive enrichment results in important increases in power.
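The predictive enrichment step described in the abstract above can be sketched as follows, under simplifying assumptions. The code takes per-patient predictions from two hypothetical model heads (outcome on treatment and outcome on placebo), forms a CATE estimate as their difference, and keeps the fraction of patients predicted to benefit most. The function names, the sign convention (more negative means more benefit), and the toy numbers are illustrative assumptions and are not taken from the paper.

```python
def estimate_cate(pred_treated, pred_placebo):
    """Per-patient CATE estimate: predicted progression on treatment minus on placebo.

    Under the sign convention assumed here, a more negative value means
    a larger predicted reduction in progression from treatment.
    """
    return [t - p for t, p in zip(pred_treated, pred_placebo)]


def select_most_responsive(cate, fraction):
    """Return sorted indices of the given fraction of patients with the most negative CATE."""
    n_keep = max(1, round(fraction * len(cate)))
    ranked = sorted(range(len(cate)), key=lambda i: cate[i])
    return sorted(ranked[:n_keep])


# Toy values standing in for the outputs of the two heads (not real data).
pred_treated = [0.30, 0.10, 0.25, 0.20]
pred_placebo = [0.35, 0.40, 0.25, 0.40]

cate = estimate_cate(pred_treated, pred_placebo)
enriched = select_most_responsive(cate, fraction=0.5)
```

Here patients 1 and 3 have the largest predicted benefit, so a trial enriched at the 50% level would preferentially randomize those two.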
Cohort Bias Adaptation in Aggregated Datasets for Lesion Segmentation
Raghav Mehta
Sotirios Tsaftaris
Douglas L. Arnold
Many automatic machine learning models developed for focal pathology (e.g. lesions, tumours) detection and segmentation perform well, but do not generalize as well to new patient cohorts, impeding their widespread adoption into real clinical contexts. One strategy to create a more diverse, generalizable training set is to naively pool datasets from different cohorts. Surprisingly, training on this big data does not necessarily increase, and may even reduce, overall performance and model generalizability, due to the existence of cohort biases that affect label distributions. In this paper, we propose a generalized affine conditioning framework to learn and account for cohort biases across multi-source datasets, which we call Source-Conditioned Instance Normalization (SCIN). Through extensive experimentation on three different, large scale, multi-scanner, multi-centre Multiple Sclerosis (MS) clinical trial MRI datasets, we show that our cohort bias adaptation method (1) improves performance of the network on pooled datasets relative to naively pooling datasets and (2) can quickly adapt to a new cohort by fine-tuning the instance normalization parameters, thus learning the new cohort bias with only 10 labelled samples.
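To make the affine conditioning idea above more concrete, here is a minimal sketch of an instance normalization layer whose scale and shift parameters are looked up per data source. It is written in plain Python for readability; the class name `SourceConditionedNorm`, its attributes, and the source labels are hypothetical, and the actual SCIN architecture may differ in ways the abstract does not describe.

```python
import math


def instance_norm(channel, eps=1e-5):
    """Normalize one channel of one sample to zero mean and unit variance."""
    mean = sum(channel) / len(channel)
    var = sum((x - mean) ** 2 for x in channel) / len(channel)
    return [(x - mean) / math.sqrt(var + eps) for x in channel]


class SourceConditionedNorm:
    """Instance normalization with a separate affine (scale, shift) pair per source.

    All sources share the rest of the network; only these per-source
    parameters differ, so adapting to a new cohort would mean fitting
    one new (scale, shift) set. This is an illustrative assumption.
    """

    def __init__(self, num_channels, sources):
        self.scale = {s: [1.0] * num_channels for s in sources}
        self.shift = {s: [0.0] * num_channels for s in sources}

    def __call__(self, features, source):
        # features: list of channels, each a flat list of activations.
        gamma, beta = self.scale[source], self.shift[source]
        return [
            [g * x + b for x in instance_norm(channel)]
            for channel, g, b in zip(features, gamma, beta)
        ]


# Toy usage with two hypothetical cohorts and a single feature channel.
norm = SourceConditionedNorm(num_channels=1, sources=["cohort_a", "cohort_b"])
norm.scale["cohort_b"][0] = 2.0
norm.shift["cohort_b"][0] = 1.0

features = [[1.0, 2.0, 3.0]]
out_a = norm(features, "cohort_a")
out_b = norm(features, "cohort_b")
```

The same input produces different outputs depending on the source label, which is how a single shared model can reproduce cohort-specific behaviour while keeping the number of cohort-specific parameters small.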
HAD-Net: A Hierarchical Adversarial Knowledge Distillation Network for Improved Enhanced Tumour Segmentation Without Post-Contrast Images
Segmentation of enhancing tumours or lesions from MRI is important for detecting new disease activity in many clinical contexts. However, accurate segmentation requires the inclusion of medical images (e.g., T1 post contrast MRI) acquired after injecting patients with a contrast agent (e.g., Gadolinium), a process no longer thought to be safe. Although a number of modality-agnostic segmentation networks have been developed over the past few years, they have been met with limited success in the context of enhancing pathology segmentation. In this work, we present HAD-Net, a novel offline adversarial knowledge distillation (KD) technique, whereby a pre-trained teacher segmentation network, with access to all MRI sequences, teaches a student network, via hierarchical adversarial training, to better overcome the large domain shift presented when crucial images are absent during inference. In particular, we apply HAD-Net to the challenging task of enhancing tumour segmentation when access to post-contrast imaging is not available. The proposed network is trained and tested on the BraTS 2019 brain tumour segmentation challenge dataset, where it achieves performance improvements in the ranges of 16% - 26% over (a) recent modality-agnostic segmentation methods (U-HeMIS, U-HVED), (b) KD-Net adapted to this problem, (c) the pre-trained student network and (d) a non-hierarchical version of the network (AD-Net), in terms of Dice scores for enhancing tumour (ET). The network also shows improvements in tumour core (TC) Dice scores. Finally, the network outperforms both the baseline student network and AD-Net in terms of uncertainty quantification for enhancing tumour segmentation based on the BraTS 2019 uncertainty challenge metrics. Our code is publicly available at: https://github.com/SaverioVad/HAD_Net