Justin Szeto

Mingyang Li

Hengguan Huang

2025-06-01

arXiv (published)

Mitigating Calibration Bias Without Fixed Attribute Grouping for Improved Fairness in Medical Imaging Analysis

Changjian Shui

Raghav Mehta

Douglas Arnold

2023-10-08

OpenReview.net/Archive (published)

openreview.net

Rethinking Generalization: The Impact of Annotation Style on Medical Image Segmentation

Brennan Nichyporuk

Jillian L. Cardinell

Raghav Mehta

Jean-Pierre R. Falet

Douglas Arnold

Sotirios A. Tsaftaris

Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, where unreliable predictions can have real-world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the "ground-truth" label annotations. This is particularly important in the context of medical image segmentation of pathological structures (e.g. lesions), where the annotation process is much more subjective, and affected by a number of underlying factors, including the annotation protocol, rater education/experience, and clinical aims, among others. In this paper, we show that modeling annotation biases, rather than ignoring them, poses a promising way of accounting for differences in annotation style across datasets. To this end, we propose a generalized conditioning framework to (1) learn and account for different annotation styles across multiple datasets using a single model, (2) identify similar annotation styles across different datasets in order to permit their effective aggregation, and (3) fine-tune a fully trained model to a new annotation style with just a few samples. Next, we present an image-conditioning approach to model annotation styles that correlate with specific image features, potentially enabling detection biases to be more easily identified.

2022-10-31

arXiv (preprint)

Cohort Bias Adaptation in Aggregated Datasets for Lesion Segmentation

Brennan Nichyporuk

Jillian L. Cardinell

Raghav Mehta

Sotirios A. Tsaftaris

Douglas Arnold

2021-09-21

Domain Adaptation and Representation Transfer, and Affordable Healthcare and AI for Resource Diverse Global Health (published)

Optimizing Operating Points for High Performance Lesion Detection and Segmentation Using Lesion Size Reweighting

Brennan Nichyporuk

Douglas Arnold

There are many clinical contexts which require accurate detection and segmentation of all focal pathologies (e.g. lesions, tumours) in patient images. In cases where there is a mix of small and large lesions, standard binary cross entropy loss will result in better segmentation of large lesions at the expense of missing small ones. Adjusting the operating point to accurately detect all lesions generally leads to oversegmentation of large lesions. In this work, we propose a novel reweighting strategy to eliminate this performance gap, increasing small pathology detection performance while maintaining segmentation accuracy. We show that our reweighting strategy vastly outperforms competing strategies based on experiments on a large-scale, multi-scanner, multi-center dataset of Multiple Sclerosis patient images.

2021-05-11

MIDL.io/2021/Conference/Short (poster)

openreview.net

Accounting for Variance in Machine Learning Benchmarks

Naz Sepah

Edward Raff

Kanika Madan

Vikram Voleti

Samira Ebrahimi Kahou

Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, revealing that variance due to data sampling, parameter initialization and hyperparameter choice markedly impacts the results. We analyze the predominant comparison methods used today in the light of this variance. We show a counter-intuitive result that adding more sources of variation to an imperfect estimator better approximates the ideal estimator at a 51 times reduction in compute cost. Building on these results, we study the error rate of detecting improvements, on five different deep-learning tasks/architectures. This study leads us to propose recommendations for performance comparisons.

2021-01-01

MLSys (published)