Andréanne Lemay

Label fusion and training methods for reliable representation of inter-rater uncertainty

Charley Gros

Enamundram Naga Karthik

Medical tasks are prone to inter-rater variability due to multiple factors such as image quality, professional experience and training, or guideline clarity. Training deep learning networks with annotations from multiple raters is a common practice that mitigates the model's bias towards a single expert. Reliable models generating calibrated outputs and reflecting the inter-rater disagreement are key to the integration of artificial intelligence in clinical practice. Various methods exist to take into account different expert labels. We focus on comparing three label fusion methods: STAPLE, average of the raters' segmentations, and random sampling of each rater's segmentation during training. Each label fusion method is studied using both the conventional training framework and the recently published SoftSeg framework that limits information loss by treating the segmentation task as a regression. Our results, across 10 data splittings on two public datasets, indicate that SoftSeg models, regardless of the ground truth fusion method, had better calibration and preservation of the inter-rater variability compared with their conventional counterparts without impacting the segmentation performance. Conventional models, i.e., trained with a Dice loss, with binary inputs, and sigmoid/softmax final activation, were overconfident and underestimated the uncertainty associated with inter-rater variability. Conversely, fusing labels by averaging with the SoftSeg framework led to underconfident outputs and overestimation of the rater disagreement. In terms of segmentation performance, the best label fusion method was different for the two datasets studied, indicating this parameter might be task-dependent. However, SoftSeg had segmentation performance systematically superior or equal to the conventionally trained models and had the best calibration and preservation of the inter-rater variability.

2023-01-17

Machine Learning for Biomedical Imaging (published)

doi.org

arxiv.org

Team NeuroPoly: Description of the Pipelines for the MICCAI 2021 MS New Lesions Segmentation Challenge

Uzay Macar

Enamundram Naga Karthik

Charley Gros

Andréanne Lemay

Julien Cohen-Adad

This paper gives a detailed description of the pipelines used for the 2nd edition of the MICCAI 2021 Challenge on Multiple Sclerosis Lesion Segmentation. An overview of the data preprocessing steps applied is provided along with a brief description of the pipelines used, in terms of the architecture and the hyperparameters. Our code for this work can be found at: https://github.com/ivadomed/ms-challenge-2021.

2021-09-11

ArXiv (preprint)

arxiv.org

Automatic multiclass intramedullary spinal cord tumor segmentation on MRI with deep learning

Andréanne Lemay

Charley Gros

Zhizheng Zhuo

Jie Zhang

Yunyun Duan

Julien Cohen-Adad

Yaou Liu

2021-07-21

NeuroImage: Clinical (published)

doi.org
