Registration-Enhanced Segmentation Method for Prostate Cancer in Ultrasound Images
Shengtian Sang, Hassan Jahanandish, Cynthia Xinran Li, Indrani Bhattacharya, Jeong Hoon Lee, Lichun Zhang, Sulaiman Vesal, Pejman Ghanouni, Richard Fan, Geoffrey A. Sonn, Mirabela Rusu
TL;DR
The paper addresses prostate tumor segmentation on transrectal ultrasound (TRUS) with a registration-enhanced MRI-TRUS fusion framework that does not require manual annotations. By integrating affine registration into an end-to-end segmentation pipeline, the approach aligns the two modalities before fusing them, reaching an average Dice coefficient of $0.212$ versus $0.117$ for TRUS-only and $0.132$ for naive MRI-TRUS fusion ($p < 0.01$). Key contributions include a two-phase training strategy, a patch-based Transformer-driven registration module, and a unified loss that couples registration, feature alignment, and segmentation. The results indicate that spatially aligning multimodal data before fusion improves tumor segmentation, and suggest that the framework may apply to other multimodal medical imaging tasks.
Abstract
Prostate cancer is a major cause of cancer-related deaths in men, and early detection greatly improves survival rates. Although MRI-TRUS fusion biopsy offers superior accuracy by combining MRI's detailed visualization with TRUS's real-time guidance, it is a complex and time-intensive procedure that relies heavily on manual annotations, which can introduce errors. To address these challenges, we propose a fully automatic MRI-TRUS fusion-based segmentation method that identifies prostate tumors directly in TRUS images without requiring manual annotations. Unlike traditional multimodal fusion approaches that rely on naive data concatenation, our method integrates a registration-segmentation framework to align and leverage spatial information between the MRI and TRUS modalities. This alignment enhances segmentation accuracy and reduces reliance on manual effort. Our approach was validated on a dataset of 1,747 patients from Stanford Hospital, achieving an average Dice coefficient of 0.212 and outperforming TRUS-only (0.117) and naive MRI-TRUS fusion (0.132) methods, with significant improvements ($p < 0.01$). This framework demonstrates the potential for reducing the complexity of prostate cancer diagnosis and provides a flexible architecture applicable to other multimodal medical imaging tasks.
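For readers unfamiliar with the metric reported above, the Dice coefficient between a predicted mask $P$ and a reference mask $G$ is $2|P \cap G| / (|P| + |G|)$, ranging from 0 (no overlap) to 1 (perfect overlap). The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function name `dice_coefficient`, the flat 0/1 mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two binary masks given as flat sequences of 0/1.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked as tumor in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total tumor pixels in each mask, summed.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted pixels, 4 reference pixels, 2 in common.
pred = [0, 1, 1, 0, 0, 1, 0, 0]
target = [0, 0, 1, 1, 0, 1, 1, 0]
score = dice_coefficient(pred, target)
print(f"Dice = {score:.3f}")
```

In the toy example the masks share 2 pixels out of 3 + 4 marked pixels, so the score is 4/7. Because Dice penalizes both missed and spurious pixels, small targets such as tumors typically yield much lower values than whole-organ segmentation.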
