
Tal Arbel

Core Academic Member
Canada CIFAR AI Chair
Full Professor, McGill University, Department of Electrical and Computer Engineering
Research Topics
Causality
Computer Vision
Deep Learning
Generative Models
Medical Machine Learning
Probabilistic Models
Representation Learning

Biography

Tal Arbel is a professor in the Department of Electrical and Computer Engineering at McGill University, where she is the director of the Probabilistic Vision Group and Medical Imaging Lab in the Centre for Intelligent Machines.

She is also a Canada CIFAR AI Chair, an associate academic member of Mila – Quebec Artificial Intelligence Institute and an associate member of the Goodman Cancer Research Centre.

Arbel’s research focuses on the development of probabilistic deep learning methods in computer vision and medical image analysis for a wide range of real-world applications, with a focus on neurological diseases.

She is a recipient of the 2019 McGill Engineering Christophe Pierre Research Award and a Fellow of the Canadian Academy of Engineering. She regularly serves on the organizing team of major international conferences in computer vision and in medical image analysis (e.g. MICCAI, MIDL, ICCV, CVPR). She is currently the Editor-in-Chief and co-founder of the arXiv overlay journal: Machine Learning for Biomedical Imaging (MELBA).

Current Students

Collaborating researcher - Université de Montréal
Master's Research - McGill University
Master's Research - McGill University
Master's Research - McGill University
PhD - McGill University
PhD - McGill University
Master's Research - McGill University
Undergraduate - McGill University
Master's Research - McGill University
Master's Research - McGill University
Master's Research - McGill University
Master's Research - McGill University

Publications

On Learning Fairness and Accuracy on Multiple Subgroups
Changjian Shui
Gezheng Xu
Qi Chen
Jiaqi Li
Charles Ling
Boyu Wang
We propose an analysis in fair learning that preserves the utility of the data while reducing prediction disparities under the criterion of group sufficiency. We focus on the scenario where the data contains multiple or even many subgroups, each with a limited number of samples. We present a principled method for learning a fair predictor for all subgroups by formulating it as a bilevel objective. Specifically, the subgroup-specific predictors are learned in the lower level from a small amount of data and the fair predictor. In the upper level, the fair predictor is updated to be close to all subgroup-specific predictors. We further prove that such a bilevel objective can effectively control the group sufficiency and generalization error. We evaluate the proposed framework on real-world datasets. Empirical evidence suggests consistently improved fair predictions, with accuracy comparable to the baselines.
Estimating treatment effect for individuals with progressive multiple sclerosis using deep learning
JR Falet
Joshua D. Durso-Finley
Jan Schroeter
Francesca Bovis
Maria-Pia Sormani
Douglas Arnold
Cohort Bias Adaptation in Aggregated Datasets for Lesion Segmentation
Jillian L. Cardinell
Raghav Mehta
Sotirios A. Tsaftaris
Douglas Arnold
HAD-Net: A Hierarchical Adversarial Knowledge Distillation Network for Improved Enhanced Tumour Segmentation Without Post-Contrast Images
Saverio Vadacchino
Raghav Mehta
James J. Clark
Segmentation of enhancing tumours or lesions from MRI is important for detecting new disease activity in many clinical contexts. However, accurate segmentation requires the inclusion of medical images (e.g., T1 post-contrast MRI) acquired after injecting patients with a contrast agent (e.g., Gadolinium), a process no longer thought to be safe. Although a number of modality-agnostic segmentation networks have been developed over the past few years, they have been met with limited success in the context of enhancing pathology segmentation. In this work, we present HAD-Net, a novel offline adversarial knowledge distillation (KD) technique, whereby a pre-trained teacher segmentation network, with access to all MRI sequences, teaches a student network, via hierarchical adversarial training, to better overcome the large domain shift presented when crucial images are absent during inference. In particular, we apply HAD-Net to the challenging task of enhancing tumour segmentation when access to post-contrast imaging is not available. The proposed network is trained and tested on the BraTS 2019 brain tumour segmentation challenge dataset, where it achieves performance improvements in the range of 16% to 26% over (a) recent modality-agnostic segmentation methods (U-HeMIS, U-HVED), (b) KD-Net adapted to this problem, (c) the pre-trained student network and (d) a non-hierarchical version of the network (AD-Net), in terms of Dice scores for enhancing tumour (ET). The network also shows improvements in tumour core (TC) Dice scores. Finally, the network outperforms both the baseline student network and AD-Net in terms of uncertainty quantification for enhancing tumour segmentation based on the BraTS 2019 uncertainty challenge metrics. Our code is publicly available at: https://github.com/SaverioVad/HAD_Net
Common limitations of performance metrics in biomedical image analysis
Annika Reinke
Matthias Eisenmann
Minu Dietlinde Tizabi
Carole H. Sudre
Tim Rädsch
Michela Antonelli
Spyridon Bakas
M. Jorge Cardoso
Veronika Cheplygina
Keyvan Farahani
Ben Glocker
Doreen Heckmann-Nötzel
Fabian Isensee
Pierre Jannin
Charles Kahn
Jens Kleesiek
Tahsin Kurc
Michal Kozubek
Bennett Landman
Geert Litjens
Klaus Maier-Hein
Anne Martel
Bjoern Menze
Henning Müller
Jens Petersen
Mauricio Reyes
Nicola Rieke
Bram Stieltjes
Ronald M. Summers
Sotirios A. Tsaftaris
Bram van Ginneken
Annette Kopp-Schneider
Paul Jäger
Lena Maier-Hein
Optimizing Operating Points for High Performance Lesion Detection and Segmentation Using Lesion Size Reweighting
There are many clinical contexts which require accurate detection and segmentation of all focal pathologies (e.g. lesions, tumours) in patient images. In cases where there is a mix of small and large lesions, standard binary cross entropy loss will result in better segmentation of large lesions at the expense of missing small ones. Adjusting the operating point to accurately detect all lesions generally leads to oversegmentation of large lesions. In this work, we propose a novel reweighting strategy to eliminate this performance gap, increasing small pathology detection performance while maintaining segmentation accuracy. We show that our reweighting strategy vastly outperforms competing strategies based on experiments on a large-scale, multi-scanner, multi-center dataset of Multiple Sclerosis patient images.
Common Limitations of Image Processing Metrics: A Picture Story
Annika Reinke
Matthias Eisenmann
Minu Dietlinde Tizabi
Carole H. Sudre
Tim Rädsch
Michela Antonelli
Spyridon Bakas
M. Jorge Cardoso
Veronika Cheplygina
Keyvan Farahani
B. Glocker
Doreen Heckmann-Nötzel
Fabian Isensee
Pierre Jannin
Charles E. Jr. Kahn
Jens Kleesiek
Tahsin Kurc
Michal Kozubek
Bennett Landman
Geert Litjens
Klaus Maier-Hein
Bjoern Menze
Henning Müller
Jens Petersen
Mauricio Reyes
Nicola Rieke
Bram Stieltjes
R. Summers
Sotirios A. Tsaftaris
Bram van Ginneken
Annette Kopp-Schneider
Paul F. Jäger
Lena Maier-Hein
Task dependent deep LDA pruning of neural networks
Qing Tian
James J. Clark
Accounting for Variance in Machine Learning Benchmarks
Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization and hyperparameter choice markedly impacts the results. We analyze the predominant comparison methods used today in the light of this variance. We show a counter-intuitive result: adding more sources of variation to an imperfect estimator brings it closer to the ideal estimator at a 51-fold reduction in compute cost. Building on these results, we study the error rate of detecting improvements on five different deep-learning tasks/architectures. This study leads us to propose recommendations for performance comparisons.
Deep LDA-Pruned Nets for Efficient Facial Gender Classification
Qing Tian
James J. Clark
Many real-time tasks, such as human-computer interaction, require fast and efficient facial gender classification. Although deep CNNs have been very effective for a multitude of classification tasks, their high space and time demands make them impractical for personal computers and mobile devices without a powerful GPU. In this paper, we develop a 16-layer, yet lightweight, neural network which boosts efficiency while maintaining high accuracy. Our net is pruned from the VGG-16 model [35] starting from the last convolutional (conv) layer where we find neuron activations are highly uncorrelated given the gender. Through Fisher’s Linear Discriminant Analysis (LDA) [8], we show that this high decorrelation makes it safe to directly discard last conv layer neurons with high within-class variance and low between-class variance. Combined with either Support Vector Machines (SVM) or Bayesian classification, the reduced CNNs are capable of achieving comparable (or even higher) accuracies on the LFW and CelebA datasets than the original net with fully connected layers. On LFW, only four Conv5_3 neurons are able to maintain a comparably high recognition accuracy, which results in a reduction of total network size by a factor of 70 with an 11-fold speedup. Comparisons with a state-of-the-art pruning method [12] (as well as two smaller nets [20, 24]) in terms of accuracy loss and convolutional layers pruning rate are also provided.
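The selection criterion described in this abstract (keep neurons with high between-class variance and low within-class variance) can be illustrated with a per-neuron Fisher ratio. The following is only a minimal sketch, not the authors' implementation: it assumes activations are already extracted as plain lists, scores each neuron independently (a simplification that is reasonable only when neurons are roughly decorrelated, as the abstract reports), and the function names `fisher_scores` and `keep_top_neurons` are hypothetical.

```python
from statistics import fmean


def fisher_scores(activations, labels):
    """Return, for each neuron, between-class scatter / within-class scatter.

    activations: list of samples, each a list of neuron activations.
    labels: list of class labels, one per sample.
    (Illustrative helper; names and data layout are assumptions.)
    """
    n_neurons = len(activations[0])
    classes = sorted(set(labels))
    scores = []
    for j in range(n_neurons):
        column = [sample[j] for sample in activations]
        overall_mean = fmean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, y in zip(column, labels) if y == c]
            class_mean = fmean(values)
            # Scatter of the class mean around the overall mean.
            between += len(values) * (class_mean - overall_mean) ** 2
            # Scatter of the samples around their own class mean.
            within += sum((v - class_mean) ** 2 for v in values)
        # Small epsilon avoids division by zero for constant neurons.
        scores.append(between / (within + 1e-12))
    return scores


def keep_top_neurons(activations, labels, k):
    """Return the sorted indices of the k neurons with the highest Fisher ratio."""
    scores = fisher_scores(activations, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data: neuron 0 separates the two classes, neuron 1 does not.
toy_activations = [[0.0, 1.0], [0.1, -1.0], [1.0, 1.0], [0.9, -1.0]]
toy_labels = [0, 0, 1, 1]
kept = keep_top_neurons(toy_activations, toy_labels, k=1)
```

On the toy data, neuron 0 has a large ratio and neuron 1 a ratio of zero, so only neuron 0 is kept. A full LDA treatment would use the within-class and between-class scatter matrices rather than per-neuron ratios.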
Deep discriminant analysis for task-dependent compact network search
Qing Tian
James J. Clark
Most of today's popular deep architectures are hand-engineered for general purpose applications. However, this design procedure usually leads to massively redundant, useless, or even harmful features for specific tasks. Such unnecessarily high complexities render deep nets impractical for many real-world applications, especially those without powerful GPU support. In this paper, we attempt to derive task-dependent compact models from a deep discriminant analysis perspective. We propose an iterative and proactive approach for classification tasks which alternates between (1) a pushing step, with an objective to simultaneously maximize class separation, penalize co-variances, and push deep discriminants into alignment with a compact set of neurons, and (2) a pruning step, which discards less useful or even interfering neurons. Deconvolution is adopted to reverse the effects of 'unimportant' filters and recover useful contributing sources. A simple network growing strategy based on the basic Inception module is proposed for challenging tasks requiring larger capacity than what the base net can offer. Experiments on the MNIST, CIFAR10, and ImageNet datasets demonstrate our approach's efficacy. On ImageNet, by pushing and pruning our grown Inception-88 model, we achieve better-performing models than smaller grown deep Inception nets, residual nets, and famous compact nets at similar sizes. We also show that our grown deep Inception nets (without hard-coded dimension alignment) can beat residual nets of similar complexities.
Preface
Ismail Ben Ayed
Marleen de Bruijne
Maxime Descoteaux