Deep Learning-Based Segmentation of Tumors in PET/CT Volumes: Benchmark of Different Architectures and Training Strategies
Monika Górka, Daniel Jaworek, Marek Wodzinski
TL;DR
This study benchmarks multiple deep learning architectures for segmenting cancer lesions in PET/CT across head/neck and whole‑body images, focusing on one‑step and two‑step segmentation strategies. Using AutoPET and HECKTOR datasets, nnU‑Net and U‑Net/V‑Net emerge as strong performers, while UNETR shows limited gains likely due to data size, underscoring the importance of data preparation and training strategy. A key finding is that training on cancer‑positive data and employing a two‑step approach can substantially improve segmentation metrics, highlighting the practical value of targeted data curation. Overall, the work demonstrates the potential of AI to support oncological diagnostics, while also pointing to the need for larger, more diverse datasets and pretraining to fully exploit transformer architectures.
Abstract
Cancer is one of the leading causes of death globally, and early diagnosis is crucial for patient survival. Deep learning algorithms have great potential for automatic cancer analysis. Artificial intelligence has achieved high performance in recognizing and segmenting single lesions. However, diagnosing multiple lesions remains a challenge. This study examines and compares various neural network architectures and training strategies for the automatic segmentation of cancer lesions in PET/CT images of the head and neck region and of the whole body. The authors analyzed datasets from the AutoPET and HECKTOR challenges, exploring popular single-step segmentation architectures and presenting a two-step approach. The results indicate that the V-Net and nnU-Net models were the most effective for their respective datasets. On the HECKTOR dataset, the aggregated Dice coefficient ranged from 0.75 to 0.76. Eliminating cancer-free cases from the AutoPET training data was found to improve the performance of most models. For AutoPET, the average segmentation performance after training only on images containing cancer lesions increased from 0.55 to 0.66 for the classic Dice coefficient and from 0.65 to 0.73 for the aggregated Dice coefficient. The research demonstrates the potential of artificial intelligence in precise oncological diagnostics and may contribute to the development of more targeted and effective cancer assessment techniques.
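The abstract reports both a classic and an aggregated Dice coefficient without defining them here. The sketch below illustrates one common reading, under stated assumptions: the classic score is taken as the mean of per-case Dice values, and the aggregated score sums intersections and volumes over all cases before dividing, as in the HECKTOR challenge definition. The function names, the toy masks, and the convention of scoring an empty prediction on an empty ground truth as 1.0 are illustrative choices, not taken from the paper.

```python
import numpy as np


def dice_per_case(pred, gt):
    """Dice coefficient for a single binary mask pair of any shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    denom = int(pred.sum()) + int(gt.sum())
    if denom == 0:
        # Assumption: both masks empty counts as a perfect match.
        return 1.0
    return 2.0 * int(np.logical_and(pred, gt).sum()) / denom


def mean_dice(preds, gts):
    """Classic Dice (assumed here): average of the per-case scores."""
    return float(np.mean([dice_per_case(p, g) for p, g in zip(preds, gts)]))


def aggregated_dice(preds, gts):
    """Aggregated Dice: pool intersections and volumes over all cases."""
    inter, total = 0, 0
    for p, g in zip(preds, gts):
        p = np.asarray(p, dtype=bool)
        g = np.asarray(g, dtype=bool)
        inter += int(np.logical_and(p, g).sum())
        total += int(p.sum()) + int(g.sum())
    # Assumption: no foreground anywhere counts as a perfect match.
    return 2.0 * inter / total if total > 0 else 1.0


# Toy data: one large lesion segmented perfectly, one tiny lesion missed.
gts = [[1, 1, 1, 1, 0, 0], [1, 0, 0, 0, 0, 0]]
preds = [[1, 1, 1, 1, 0, 0], [0, 1, 0, 0, 0, 0]]

print(mean_dice(preds, gts))        # per-case scores are 1.0 and 0.0
print(aggregated_dice(preds, gts))  # 2 * 4 / (8 + 2)
```

Under these definitions the two metrics can diverge: the aggregated form weights each case by its lesion volume, so a missed small lesion lowers it less than it lowers the per-case mean. This may be one reason the aggregated values quoted above are higher than the classic ones.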
