Estimation and Analysis of Slice Propagation Uncertainty in 3D Anatomy Segmentation

Rachaell Nihalaani, Tushar Kataria, Jadie Adams, Shireen Y. Elhabian

TL;DR

The paper addresses the challenge of 3D anatomy segmentation under limited annotations by employing self-supervised slice propagation and integrating calibrated epistemic uncertainty quantification (UQ) to assess reliability. It adapts and benchmarks five UQ methods across two slice-propagation models, Sli2Vol and Vol2Flow, on three abdominal datasets to evaluate both segmentation accuracy and uncertainty calibration. Key findings show that UQ can improve both accuracy and trustworthiness, with concrete dropout delivering strong segmentation and uncertainty estimates, while SWAG offers better calibration at some cost to accuracy, highlighting trade-offs between methods. The work provides open-source code and a comprehensive benchmark, underscoring the practical value of calibrated UQ for safe, annotation-efficient medical image segmentation and outlining avenues for future domain-aware UQ enhancements.

Abstract

Supervised methods for 3D anatomy segmentation demonstrate superior performance but are often limited by the availability of annotated data. This limitation has led to a growing interest in self-supervised approaches in tandem with the abundance of available un-annotated data. Slice propagation has emerged as a self-supervised approach that leverages slice registration as a self-supervised task to achieve full anatomy segmentation with minimal supervision. This approach significantly reduces the need for domain expertise, time, and the cost associated with building fully annotated datasets required for training segmentation networks. However, this shift toward reduced supervision via deterministic networks raises concerns about the trustworthiness and reliability of predictions, especially when compared with more accurate supervised approaches. To address this concern, we propose the integration of calibrated uncertainty quantification (UQ) into slice propagation methods, providing insights into the model's predictive reliability and confidence levels. Incorporating uncertainty measures enhances user confidence in self-supervised approaches, thereby improving their practical applicability. We conducted experiments on three datasets for 3D abdominal segmentation using five UQ methods. The results illustrate that incorporating UQ improves not only model trustworthiness, but also segmentation accuracy. Furthermore, our analysis reveals various failure modes of slice propagation methods that might not be immediately apparent to end-users. This study opens up new research avenues to improve the accuracy and trustworthiness of slice propagation methods.

Paper Structure (8 sections, 4 figures, 1 table)

Figures (4)

  • Figure 1: Sli2Vol accuracy and uncertainty variation as a function of distance from the annotated slice. (A) Comparative analysis of variability in DSC and uncertainty metrics relative to the distance from the annotated slice when using concrete dropout (dataset: DecathSpleen). Performance metrics (B) DSC, (C) surface Dice, and (D) uncertainty for all UQ methods, relative to the distance from the annotated slice.
  • Figure 2: Vol2Flow accuracy and uncertainty variation as a function of distance from the annotated slice. (A) Comparative analysis of variability in DSC and uncertainty metrics relative to the distance from the annotated slice when using MC dropout (dataset: CHAOS). Performance metrics (B) DSC, (C) surface Dice, and (D) uncertainty for all UQ methods, relative to the distance from the annotated slice.
  • Figure 3: Sli2Vol accuracy and uncertainty variation as a function of distance from the annotated slice. (A) Comparative analysis of variability in DSC and uncertainty metrics relative to the distance from the annotated slice when using concrete dropout (dataset: SLiver07). Performance metrics (B) DSC, (C) surface Dice, and (D) uncertainty for all UQ methods, relative to the distance from the annotated slice.
  • Figure 4: Comparative analysis of variability in DSC and uncertainty metrics relative to the distance from the annotated slice. This figure compares DSC variability and uncertainty metrics across each slice propagation method, dataset, and UQ method. A consistent trend is observed across all categories. Supplementary GIFs are provided to visually demonstrate the progression of predicted segmentations and associated uncertainties through a volume.
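
The figures above plot DSC against the distance from the annotated slice. As a rough illustration of that kind of analysis, the sketch below computes a per-slice Dice similarity coefficient and averages it by absolute slice offset. It is not the authors' code: the function names `dice` and `dsc_by_distance`, the flat 0/1 mask representation, and the convention that two empty masks score 1.0 are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def dice(pred, target):
    """Dice similarity coefficient between two flat binary (0/1) masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def dsc_by_distance(pred_slices, target_slices, annotated_index):
    """Mean per-slice DSC, grouped by |slice index - annotated_index|."""
    groups = defaultdict(list)
    for i, (pred, target) in enumerate(zip(pred_slices, target_slices)):
        groups[abs(i - annotated_index)].append(dice(pred, target))
    return {d: mean(scores) for d, scores in sorted(groups.items())}


# Toy volume: 5 slices of 4 pixels each, with slice 2 as the annotated slice.
targets = [[1, 1, 0, 0]] * 5
preds = [
    [0, 0, 0, 0],  # distance 2: prediction missed the structure entirely
    [1, 1, 0, 0],  # distance 1: exact match
    [1, 1, 0, 0],  # distance 0: the annotated slice itself
    [1, 0, 0, 0],  # distance 1: under-segmentation
    [1, 1, 1, 1],  # distance 2: over-segmentation
]
curve = dsc_by_distance(preds, targets, annotated_index=2)
for distance, score in curve.items():
    print(f"distance {distance}: mean DSC {score:.3f}")
```

The same grouping could be applied to a per-slice uncertainty summary in place of `dice` to obtain an uncertainty-versus-distance curve, assuming such a per-slice score is available from the chosen UQ method.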