EDUE: Expert Disagreement-Guided One-Pass Uncertainty Estimation for Medical Image Segmentation

Kudaibergen Abutalip; Numan Saeed; Ikboljon Sobirov; Vincent Andrearczyk; Adrien Depeursinge; Mohammad Yaqub

TL;DR

The paper tackles trustworthy uncertainty estimation (UE) in medical image segmentation by aligning model uncertainty with inter-expert disagreement, using multi-annotator data. It introduces EDUE, an expert disagreement-guided, single-pass UE method built on a U-Net-style architecture with a disagreement guidance module and a random-annotation training strategy, which produces calibrated uncertainty heatmaps. In experiments on the RIGA and HECKTOR datasets, EDUE achieves higher correlation with expert disagreement at the image and pixel levels and lower negative log-likelihood (NLL), while maintaining competitive Dice scores and requiring fewer parameters than deep ensembles. The approach offers practical benefits for calibration, segmentation quality control, and out-of-distribution detection in clinical settings.

Abstract

Deploying deep learning (DL) models in medical applications relies on predictive performance and on other critical factors, such as conveying trustworthy predictive uncertainty. Uncertainty estimation (UE) methods provide potential solutions for evaluating prediction reliability and improving model confidence calibration. Despite increasing interest in UE, challenges persist, such as the need for explicit methods to capture aleatoric uncertainty and to align uncertainty estimates with real-life disagreements among domain experts. This paper proposes Expert Disagreement-Guided Uncertainty Estimation (EDUE) for medical image segmentation. By leveraging variability in ground-truth annotations from multiple raters, we guide the model during training and incorporate random sampling-based strategies to improve confidence calibration. Compared to state-of-the-art deep ensembles, our method improves correlation with expert disagreements by 55% at the image level and 23% at the pixel level on average, and achieves better calibration and competitive segmentation performance, while requiring only a single forward pass.

Paper Structure (8 sections, 1 equation, 4 figures, 2 tables)

Introduction
Methodology
Experimental Details
Results and Discussion
Correlation Analysis and Segmentation Performance
Segmentation Quality Control
Out-of-Distribution Detection
Conclusion

Figures (4)

Figure 1: EDUE follows a U-Net-like architecture. The disagreement guidance module captures uncertainty by comparing variance heatmaps from the model and labels. Prediction optimization with a random sampling strategy is used for segmentation outputs.
Figure 2: Segmentation quality control results for LE, proposed method, and DE. Corresponding dashed lines are ideal lines. d-AUC: the difference between the area under the main curve and the ideal line (lower is better). (a) Cup results (b) GTVn results.
Figure 3: Out-of-distribution detection results. The box plots show the distribution of layer agreements and model agreements for LE, EDUE, and DE, respectively, at 0%, 50%, and 100% of distorted images. (a) Disc results (b) Cup results.
Figure 4: Sample (cup) from the RIGA dataset. Top row: Input image with contours of all masks, ground-truth variance heatmap, variance heatmaps from EDUE, LE, DE. Bottom row: Input image with soft majority voting mask's contour (threshold 0.5), corresponding ground-truth mask, predicted masks from EDUE, LE, DE. SV: sum of variances.
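
The quantities named in the Figure 4 caption (ground-truth variance heatmap, sum of variances, soft majority voting mask at threshold 0.5) and the random-annotation strategy mentioned in the TL;DR can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of population variance, the strict ">" comparison at the threshold, and the toy 2x2 masks are all assumptions made for illustration.

```python
import random

# Toy example (hypothetical data): three raters' binary masks for a 2x2 image.
RATER_MASKS = [
    [[1, 1], [0, 0]],
    [[1, 0], [0, 0]],
    [[1, 1], [1, 0]],
]

def pixel_variance(masks):
    """Pixel-wise variance across raters (population variance is an assumption)."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    var_map = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = [m[r][c] for m in masks]
            mean = sum(values) / n
            var_map[r][c] = sum((v - mean) ** 2 for v in values) / n
    return var_map

def sum_of_variances(var_map):
    """Image-level disagreement score: the sum of all pixel variances (SV)."""
    return sum(sum(row) for row in var_map)

def soft_majority_vote(masks, threshold=0.5):
    """Average the raters' masks, then binarize (strict '>' is an assumption)."""
    n = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if sum(m[r][c] for m in masks) / n > threshold else 0 for c in range(cols)]
        for r in range(rows)
    ]

def random_annotation(masks, rng):
    """Pick one rater's mask at random, e.g. as a per-step training target."""
    return rng.choice(masks)

gt_variance = pixel_variance(RATER_MASKS)       # ground-truth variance heatmap
sv = sum_of_variances(gt_variance)              # image-level disagreement
consensus = soft_majority_vote(RATER_MASKS)     # soft majority voting mask
target = random_annotation(RATER_MASKS, random.Random(0))
```

In this toy case the raters disagree on two of the four pixels, so only those pixels contribute to the sum of variances, and the consensus mask keeps the pixels marked by at least two of the three raters.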
