Interactive 3D Segmentation for Primary Gross Tumor Volume in Oropharyngeal Cancer
Mikko Saukkoriipi, Jaakko Sahlsten, Joel Jaskari, Lotta Orasmaa, Jari Kangas, Nastaran Rasouli, Roope Raisamo, Jussi Hirvonen, Helena Mehtonen, Jorma Järnstedt, Antti Mäkitie, Mohamed Naser, Clifton Fuller, Benjamin Kann, Kimmo Kaski
TL;DR
This study addresses accurate segmentation of the primary gross tumor volume (GTVp) in oropharyngeal cancer (OPC) by introducing 2S-ICR, a two-stage interactive deep learning (DL) framework that couples a standard volumetric DL model with an interactive refinement model. Trained on the HECKTOR dataset and evaluated externally on MD Anderson data, 2S-ICR performs strongly without user interaction and improves further as user interactions are added, with ensemble refinements providing additional gains. The results demonstrate robust 3D interactive segmentation in PET-CT, reducing manual effort while maintaining high accuracy, and point to broader potential for interactive segmentation in clinical radiotherapy planning. The work highlights the practicality of click-based corrections in 3D volumetric segmentation and discusses limitations and future extensions to multi-class segmentation, studies with real users, and additional imaging modalities.
Abstract
The main treatment modality for oropharyngeal cancer (OPC) is radiotherapy, where accurate segmentation of the primary gross tumor volume (GTVp) is essential. However, accurate GTVp segmentation is challenging due to significant interobserver variability and the time-consuming nature of manual annotation, while fully automated methods can occasionally fail. An interactive deep learning (DL) model offers the advantage of automatic high-performance segmentation with the flexibility for user correction when necessary. In this study, we examine interactive DL for GTVp segmentation in OPC. We implement state-of-the-art algorithms and propose a novel two-stage Interactive Click Refinement (2S-ICR) framework. Using the 2021 HEad and neCK TumOR (HECKTOR) dataset for development and an external dataset from The University of Texas MD Anderson Cancer Center for evaluation, the 2S-ICR framework achieves a Dice similarity coefficient of 0.713 $\pm$ 0.152 without user interaction and 0.824 $\pm$ 0.099 after five interactions, outperforming existing methods in both cases.
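For readers unfamiliar with the reported metric, the Dice similarity coefficient between a predicted mask $P$ and a reference mask $G$ is $2|P \cap G| / (|P| + |G|)$. The snippet below is a minimal sketch of that formula for binary 3D volumes, not the authors' evaluation code. The function name `dice_coefficient`, the toy volumes, and the convention of returning 1.0 when both masks are empty are illustrative assumptions.

```python
import numpy as np


def dice_coefficient(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks of equal shape."""
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * intersection / total)


# Toy 3D example (hypothetical data): the prediction covers 8 voxels,
# the reference covers 12 voxels, and they share 8 voxels.
pred = np.zeros((4, 4, 4), dtype=np.uint8)
pred[1:3, 1:3, 1:3] = 1
target = np.zeros((4, 4, 4), dtype=np.uint8)
target[1:3, 1:3, 1:4] = 1

score = dice_coefficient(pred, target)  # 2 * 8 / (8 + 12) = 0.8
print(f"Dice: {score:.3f}")
```

A score of 1 indicates perfect overlap and 0 indicates none, so the reported increase from 0.713 to 0.824 corresponds to a larger shared volume between the predicted and reference GTVp.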
