Towards Integrating Epistemic Uncertainty Estimation into the Radiotherapy Workflow

Marvin Tom Teichmann, Manasi Datar, Lisa Kratzke, Fernando Vega, Florin C. Ghesu

TL;DR

This work addresses the reliability of deep learning (DL)-based organ-at-risk (OAR) contouring in radiotherapy on out-of-distribution (OOD) data. It combines epistemic uncertainty estimation, using a deep ensemble and Monte Carlo dropout within a ConvNeXt/U-Net-style 3D segmentation framework, with a Mahalanobis-distance-based OOD detector fitted on clinical data. The study introduces datasets for uncertainty evaluation, describes the architecture and training protocol, and reports an AUC-ROC of 0.95 for OOD detection, with a specificity of 0.95 and a sensitivity of 0.92 for implant cases. These results suggest that uncertainty signals can serve as an early warning that a prediction needs expert review. The findings support safer integration of AI into FDA-approved OAR segmentation workflows and motivate broader benchmarks and comparative evaluations of uncertainty quantification in medical imaging.

Abstract

The precision of contouring target structures and organs-at-risk (OAR) in radiotherapy planning is crucial for ensuring treatment efficacy and patient safety. Recent advancements in deep learning (DL) have significantly improved OAR contouring performance, yet the reliability of these models, especially in the presence of out-of-distribution (OOD) scenarios, remains a concern in clinical settings. This application study explores the integration of epistemic uncertainty estimation within the OAR contouring workflow to enable OOD detection in clinically relevant scenarios, using specifically compiled data. Furthermore, we introduce an advanced statistical method for OOD detection to enhance the methodological framework of uncertainty estimation. Our empirical evaluation demonstrates that epistemic uncertainty estimation is effective in identifying instances where model predictions are unreliable and may require an expert review. Notably, our approach achieves an AUC-ROC of 0.95 for OOD detection, with a specificity of 0.95 and a sensitivity of 0.92 for implant cases, underscoring its efficacy. This study addresses significant gaps in the current research landscape, such as the lack of ground truth for uncertainty estimation and limited empirical evaluations. Additionally, it provides a clinically relevant application of epistemic uncertainty estimation in an FDA-approved and widely used clinical solution for OAR segmentation from Varian, a Siemens Healthineers company, highlighting its practical benefits.
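
To make the notion of epistemic uncertainty from an ensemble concrete, the following is a minimal sketch for a single voxel. It assumes the uncertainty measure is the mutual information between the prediction and the ensemble member (total predictive entropy minus the mean per-member entropy). This summary does not state which measure the authors use, so the measure, the function names, and the toy probabilities are illustrative assumptions, not the paper's implementation.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def epistemic_uncertainty(member_probs):
    """Decompose ensemble uncertainty for ONE voxel.

    member_probs: one class-probability list per ensemble member
                  (or per Monte Carlo dropout pass).
    Returns (total, aleatoric, epistemic), all in nats.
    Assumption: epistemic = mutual information, not the paper's stated choice.
    """
    n = len(member_probs)
    n_classes = len(member_probs[0])
    # Average the members' predictions to get the ensemble prediction.
    mean = [sum(m[c] for m in member_probs) / n for c in range(n_classes)]
    total = entropy(mean)                                  # entropy of the mean
    aleatoric = sum(entropy(m) for m in member_probs) / n  # mean of entropies
    return total, aleatoric, total - aleatoric

# Toy inputs (made up): members that agree vs. members that disagree.
agree = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
disagree = [[0.99, 0.01], [0.01, 0.99], [0.5, 0.5]]

print(epistemic_uncertainty(agree)[2])     # close to zero
print(epistemic_uncertainty(disagree)[2])  # clearly positive
```

When all members agree, the disagreement term vanishes even if each member is individually unsure; it grows only when members contradict each other, which is the behavior one would want on unfamiliar inputs such as implants.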

Paper Structure (20 sections, 4 figures)

Figures (4)

  • Figure 1: Examples of OOD scenarios in radiotherapy contouring: (a) Femur implants cause image artifacts. (b) Brachytherapy applicator devices distort the anatomy. (c) A hydrogel rectal spacer can significantly alter anatomical contours.
  • Figure 2: Visualization of our uncertainty model for OAR contouring.
  • Figure 3: Visualization of results for OOD datasets, showing the prediction (left), raw maximum uncertainties (center), and processed maximum uncertainties (right). Maximum uncertainty is computed across all foreground classes (organs).
  • Figure 4: Statistical analysis: (a) Cumulative class-conditional Mahalanobis distance curves (x-axis truncated). (b) Scatterplot of Mahalanobis distance scores for each dataset, including the OOD threshold. (c) ROC curve for OOD detection.
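
The quantities named in Figure 4 (a Mahalanobis distance score per case, an OOD threshold, and the resulting sensitivity and specificity) can be sketched as follows. This is a simplified illustration under stated assumptions: it fits a single Gaussian to hypothetical in-distribution feature vectors, whereas a class-conditional variant would fit one mean and covariance per organ class. How the paper derives its features is not given in this summary, and the feature values, labels, and threshold below are made up.

```python
import math

def mean_and_cov(rows):
    """Sample mean and unbiased covariance of a list of feature vectors."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def mahalanobis(x, mu, cov):
    """sqrt((x - mu)^T cov^{-1} (x - mu)); assumes cov is non-singular."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    sq = sum(d * y for d, y in zip(diff, solve(cov, diff)))
    return math.sqrt(max(0.0, sq))  # clamp tiny negative rounding errors

def sensitivity_specificity(scores, is_ood, threshold):
    """Flag score > threshold as OOD and compare with the known labels."""
    tp = sum(s > threshold and o for s, o in zip(scores, is_ood))
    tn = sum(s <= threshold and not o for s, o in zip(scores, is_ood))
    n_ood = sum(is_ood)
    return tp / n_ood, tn / (len(is_ood) - n_ood)

# Hypothetical in-distribution reference features (illustrative only).
reference = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
mu, cov = mean_and_cov(reference)

# Hypothetical evaluation cases; True marks a known OOD case.
cases = [[1.0, 1.0], [1.5, 0.5], [5.0, 1.0], [1.0, 6.0]]
labels = [False, False, True, True]
scores = [mahalanobis(c, mu, cov) for c in cases]
sens, spec = sensitivity_specificity(scores, labels, threshold=2.0)
print(scores, sens, spec)
```

Sweeping the threshold over all observed scores and recording each (1 - specificity, sensitivity) pair would trace a ROC curve of the kind shown in panel (c); the threshold of 2.0 here is arbitrary and not the one used in the paper.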