Interactive 3D Segmentation for Primary Gross Tumor Volume in Oropharyngeal Cancer

Mikko Saukkoriipi, Jaakko Sahlsten, Joel Jaskari, Lotta Orasmaa, Jari Kangas, Nastaran Rasouli, Roope Raisamo, Jussi Hirvonen, Helena Mehtonen, Jorma Järnstedt, Antti Mäkitie, Mohamed Naser, Clifton Fuller, Benjamin Kann, Kimmo Kaski

TL;DR

This study addresses the challenge of accurate GTVp segmentation in OPC by introducing 2S-ICR, a two-stage interactive deep learning framework that couples a standard volumetric DL model with an interactive refinement model. Trained on the HECKTOR dataset and externally validated on MD Anderson data, 2S-ICR achieves a strong baseline without interaction and outperforms existing methods as clinicians interact, with ensemble refinements providing additional gains. The results demonstrate robust interactive 3D segmentation on PET-CT, reducing manual effort while maintaining high accuracy, and point to broader potential for interactive segmentation in clinical radiotherapy planning. The work highlights the practicality of click-based corrections in volumetric segmentation and discusses limitations and future extensions toward multi-class segmentation, studies with real users, and other imaging modalities.

Abstract

The main treatment modality for oropharyngeal cancer (OPC) is radiotherapy, where accurate segmentation of the primary gross tumor volume (GTVp) is essential. However, accurate GTVp segmentation is challenging due to significant interobserver variability and the time-consuming nature of manual annotation, while fully automated methods can occasionally fail. An interactive deep learning (DL) model offers the advantage of automatic high-performance segmentation with the flexibility for user correction when necessary. In this study, we examine interactive DL for GTVp segmentation in OPC. We implement state-of-the-art algorithms and propose a novel two-stage Interactive Click Refinement (2S-ICR) framework. Using the 2021 HEad and neCK TumOR (HECKTOR) dataset for development and an external dataset from The University of Texas MD Anderson Cancer Center for evaluation, the 2S-ICR framework achieves a Dice similarity coefficient of 0.713 $\pm$ 0.152 without user interaction and 0.824 $\pm$ 0.099 after five interactions, outperforming existing methods in both cases.
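The Dice similarity coefficient reported above is, for a predicted mask $A$ and a reference mask $B$, $\mathrm{DSC} = 2|A \cap B| / (|A| + |B|)$. The following is a minimal sketch of that computation on binary 3D volumes, not the authors' evaluation code. The function name `dice_similarity`, the toy volumes, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_similarity(pred, target):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * intersection / total)


# Toy 4x4x4 volumes (illustrative only, not data from the paper).
pred = np.zeros((4, 4, 4), dtype=bool)
target = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 1:3] = True    # 8 predicted voxels
target[1:3, 1:3, 1:4] = True  # 12 reference voxels, 8 of them shared

dsc = dice_similarity(pred, target)  # 2 * 8 / (8 + 12) = 0.8
```

A per-patient DSC computed this way can then be summarised as mean $\pm$ standard deviation across the evaluation set, which is the form in which the abstract reports its results.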

Paper Structure (13 sections, 8 equations, 4 figures, 4 tables, 1 algorithm)

Figures (4)

  • Figure 1: Comparison of models by change in segmentation performance, measured as Dice similarity coefficient (DSC), on the MDA dataset. Statistical significance between models is assessed with the two-sided Wilcoxon signed-rank test, using the Benjamini–Hochberg procedure to correct for multiple testing; $p < 0.05$ is considered significant.
  • Figure 2: Change in segmentation performance, measured as Dice similarity coefficient (DSC), on individual samples from the MDA dataset using the ensemble version of 2S-ICR. The DSC with and without interactions is shown on the y-axis and x-axis, respectively.
  • Figure 3: Progressive refinement of the segmentation through the first three user interactions, overlaid on CT (top row) and PET (bottom row) slices. False positives are marked in yellow and clicks are indicated by white arrows.
  • Figure 4: Visualisation of the 2S-ICR framework. The initial segmentation ($t=0$) is provided by a standard model, shown in the green box on the left. The segmentation refinement loop ($t\geq1$), which uses a refinement model, is shown in the yellow box on the right. Notation: spatial dimensions (H$\times$W$\times$D), thresholding ($>$), and negative (Neg) and positive (Pos) feature maps.
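
The Benjamini–Hochberg correction mentioned in the Figure 1 caption can be sketched as follows. This is a generic illustration of the procedure, not the authors' analysis script. The function name `benjamini_hochberg` and the example p-values are hypothetical; in practice the raw p-values would come from paired tests such as `scipy.stats.wilcoxon` applied to per-patient DSC values of two models.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)  # indices that sort p in ascending order
    # Scale each sorted p-value by m / rank (rank starts at 1).
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity: each adjusted value is the minimum of the
    # scaled values at its own rank and all higher ranks.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m, dtype=float)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


# Hypothetical raw p-values from four pairwise model comparisons
# (placeholders, not results from the paper).
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)  # [0.02, 0.04, 0.04, 0.02]
significant = adj_p < 0.05         # threshold stated in the caption
```

Comparing the adjusted values against 0.05 controls the false discovery rate across all pairwise comparisons, rather than testing each pair in isolation.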