Label Dropout: Improved Deep Learning Echocardiography Segmentation Using Multiple Datasets With Domain Shift and Partial Labelling

Iman Islam; Esther Puyol-Antón; Bram Ruijsink; Andrew J. Reader; Andrew P. King

TL;DR

The paper tackles the challenge of robust echocardiography segmentation when training data are multi-source and only partially labelled. It shows that existing adaptive losses can suffer from shortcut learning due to domain shift, and introduces label dropout to decouple domain features from label presence. Through multi-dataset experiments, label dropout yields substantial Dice improvements for LV, LVM, and LA, approaching benchmark performance despite partial labelling. The findings highlight a practical strategy for deploying generalisable echo segmentation tools across scanners and operators, with potential extension to other modalities and architectures.

Abstract

Echocardiography (echo) is the first imaging modality used when assessing cardiac function. Measuring functional biomarkers from echo relies on the segmentation of cardiac structures, and deep learning models have been proposed to automate this segmentation. However, to translate these tools to widespread clinical use, the segmentation models must be robust to a wide variety of images (e.g. acquired from different scanners or by operators with different levels of expertise). Achieving this level of robustness requires training the models with multiple diverse datasets. A significant challenge when training with multiple diverse datasets is the variation in label presence, i.e. the combined data are often partially labelled. Adaptations of the cross entropy loss function have been proposed to deal with partially labelled data. In this paper we show that training naively with such a loss function and multiple diverse datasets can lead to a form of shortcut learning, in which the model associates label presence with domain characteristics, leading to a drop in performance. To address this problem, we propose a novel label dropout scheme that breaks the link between domain characteristics and the presence or absence of labels. We demonstrate that label dropout improves echo segmentation Dice score by 62% and 25% on two cardiac structures when training using multiple diverse partially labelled datasets.
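The two ideas in the abstract, a cross entropy loss adapted to partial labels and label dropout, can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract does not give the loss formula, so the sketch assumes one common adaptation in which the predicted probabilities of unannotated classes are merged into the background. The function names, the flat-list image representation, and the `BACKGROUND` index are all hypothetical and chosen only for illustration.

```python
import math
import random

BACKGROUND = 0  # assumed index of the background class


def label_dropout(mask, present, p, rng):
    """Drop each annotated foreground class with probability p.

    mask    -- flat list of integer pixel labels for one image
    present -- set of class indices annotated in this image
    Returns a new (mask, present) pair: dropped classes are relabelled as
    background and marked as unannotated, regardless of the source dataset.
    """
    dropped = {c for c in sorted(present) if c != BACKGROUND and rng.random() < p}
    new_mask = [BACKGROUND if label in dropped else label for label in mask]
    return new_mask, set(present) - dropped


def adaptive_cross_entropy(probs, mask, present, eps=1e-8):
    """Mean cross entropy with unannotated classes merged into background.

    probs -- per-pixel lists of softmax probabilities, one entry per class
    """
    total = 0.0
    for pixel_probs, label in zip(probs, mask):
        if label == BACKGROUND:
            # A background pixel may truly belong to any unannotated class,
            # so predicting such a class there is not penalised.
            q = pixel_probs[BACKGROUND] + sum(
                pr for c, pr in enumerate(pixel_probs)
                if c != BACKGROUND and c not in present
            )
        else:
            q = pixel_probs[label]
        total -= math.log(q + eps)
    return total / len(mask)


# Toy usage: a 4-pixel "image" with classes 0..3 and uniform predictions.
rng = random.Random(0)
mask, present = [0, 1, 2, 3], {0, 1, 2, 3}
probs = [[0.25] * 4 for _ in mask]
aug_mask, aug_present = label_dropout(mask, present, p=0.5, rng=rng)
loss = adaptive_cross_entropy(probs, aug_mask, aug_present)
```

Under these assumptions, label dropout makes "class is unannotated" occur in images from every dataset, so label absence is no longer predictable from domain appearance alone, which is the link the abstract says the scheme is meant to break.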

Paper Structure (12 sections, 5 figures, 2 tables)

This paper contains 12 sections, 5 figures, 2 tables.

Introduction
Materials and Methods
Datasets:
Baseline segmentation models:
Label dropout:
Experiments and Results
Experiment 1 - The need for training with multiple diverse datasets:
Experiment 2 - Training using a combination of three diverse partially labelled echo datasets:
Experiment 3 - Investigating the adaptive loss in a controlled experiment:
Experiment 4 - Label dropout:
Discussion and Conclusion
Acknowledgments.

Figures (5)

Figure 1: Experiment 1: Test Dice scores achieved by training and evaluating intra-domain and cross-domain LV segmentation models using three different datasets. C = CAMUS (Leclerc et al., 2019), UI = Unity Imaging (Huang et al., 2023), END = EchoNet Dynamic (Ouyang et al., 2020).
Figure 2: Experiment 2: Example test results from the three datasets. From left to right: image, ground truth segmentation, and model predictions using the standard loss, the adaptive loss without augmentation, and the adaptive loss with augmentation.
Figure 3: Experiment 3: Test set results when training using only the CAMUS dataset with 50% of LVM labels removed. Box plots show Dice coefficients for each segmented structure and the overall mean. Green = benchmark, blue = standard, pink = adaptive.
Figure 4: Experiment 4: Label Dropout. (i) Test set Dice scores from models trained with different probabilities of label dropout on the LA for images from the Unity Imaging dataset. Models were trained three times with different random seeds and the error bars show the mean and standard deviation of the results. Benchmark was trained with a standard loss using fully labelled data. (ii) Sample results on the Unity Imaging dataset. From left to right: image, ground truth segmentation and model predictions without label dropout and with 50% label dropout.
Figure 5: Repetition of Experiment 2 with label dropout. Randomly selected test set results when training with all 3 datasets using label dropout (LD). From left to right: image, ground truth segmentation, and model predictions using the adaptive loss with augmentation, first without and then with label dropout.
