Registration-Enhanced Segmentation Method for Prostate Cancer in Ultrasound Images

Shengtian Sang, Hassan Jahanandish, Cynthia Xinran Li, Indrani Bhattacharya, Jeong Hoon Lee, Lichun Zhang, Sulaiman Vesal, Pejman Ghanouni, Richard Fan, Geoffrey A. Sonn, Mirabela Rusu

TL;DR

The paper tackles the challenge of accurate prostate tumor segmentation on ultrasound by eliminating the reliance on manual MRI annotations through a registration-enhanced MRI-TRUS fusion framework. By integrating affine registration within an end-to-end segmentation pipeline, the approach aligns modality-specific information before segmentation, yielding a Dice coefficient of $0.212$ on the test set and achieving statistically significant gains over TRUS-only and naive fusion baselines ($p < 0.01$). Key contributions include a two-phase training strategy, a patch-based Transformer-driven registration module, and a unified loss that couples registration, feature alignment, and segmentation. The results demonstrate that spatial alignment of multimodal data is crucial for high-fidelity tumor segmentation in multimodal medical imaging and suggest broader applicability to other multimodal segmentation tasks.

Abstract

Prostate cancer is a major cause of cancer-related deaths in men, where early detection greatly improves survival rates. Although MRI-TRUS fusion biopsy offers superior accuracy by combining MRI's detailed visualization with TRUS's real-time guidance, it is a complex and time-intensive procedure that relies heavily on manual annotations, leading to potential errors. To address these challenges, we propose a fully automatic MRI-TRUS fusion-based segmentation method that identifies prostate tumors directly in TRUS images without requiring manual annotations. Unlike traditional multimodal fusion approaches that rely on naive data concatenation, our method integrates a registration-segmentation framework to align and leverage spatial information between MRI and TRUS modalities. This alignment enhances segmentation accuracy and reduces reliance on manual effort. Our approach was validated on a dataset of 1,747 patients from Stanford Hospital, achieving an average Dice coefficient of 0.212, outperforming TRUS-only (0.117) and naive MRI-TRUS fusion (0.132) methods, with significant improvements ($p < 0.01$). This framework demonstrates the potential for reducing the complexity of prostate cancer diagnosis and provides a flexible architecture applicable to other multimodal medical imaging tasks.
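
For readers unfamiliar with the metric reported above, the Dice coefficient between a predicted mask $P$ and a reference mask $G$ is $2|P \cap G| / (|P| + |G|)$. The following is a minimal sketch in plain Python, not the authors' evaluation code: the function name `dice_coefficient`, the flat 0/1 mask representation, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two binary masks given as flat sequences of 0/1.

    Hypothetical helper for illustration; real pipelines usually operate
    on 3D arrays, but the arithmetic is the same once voxels are flattened.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labelled as tumor in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total tumor voxels in each mask, summed.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# One shared voxel, two tumor voxels in each mask: 2 * 1 / (2 + 2).
print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```

A score of 1.0 means the two masks coincide and 0.0 means they share no voxels, which gives a scale for reading the 0.212, 0.132, and 0.117 averages quoted in the abstract.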

Paper Structure

This paper contains 16 sections, 14 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: (a) Illustration of the MRI-TRUS fusion-based biopsy process. The patient first undergoes an MRI scan, during which a physician annotates the suspicious tumor region and the prostate gland on the MRI image. During the biopsy procedure, the tumor region identified on the MRI is mapped onto the ultrasound image to guide the needle placement for sampling. (b) Illustration of the fused MRI and ultrasound images. The top part of the figure shows the MRI and ultrasound images placed in the same spatial coordinate system, highlighting the initial misalignment of prostate information between the two modalities. The bottom part displays the MRI-ultrasound fusion after alignment using our method, showing improved alignment of prostate information across both imaging modalities.
  • Figure 2: This figure illustrates the overall architecture of the proposed method. The numbered components represent key steps in the framework, with steps 4 and 9 corresponding to the registration and segmentation processes. The architecture incorporates positional embeddings (Step 1) and feature extraction (Steps 2 and 3) to process input images. Affine registration (Step 4) is performed to align the data, generating a transform matrix that guides the segmentation process (Step 9). Arrows indicate data flows, while element-wise addition for feature integration is denoted by the "+" symbol. Upsampling (Step 7) and downsampling (Step 2) further refine the outputs.
  • Figure 3: The architecture of the registration method adopted in our approach. The input TRUS and MRI features are initially divided into equal-sized patches through a patch-splitting operation, and a patch-merging step then combines and reshapes the features. The transformed feature vectors pass through N consecutive Transformer Encoder layers, which consist of self-attention, feed-forward, and normalization blocks. Finally, the Multi-Layer Perceptron (MLP) processes the output vectors to generate the Transform Matrix, which enables affine transformations.
  • Figure 4: Effect of TRUS-MRI Alignment on Prostate and Tumor Segmentation Performance. The left panel illustrates the spatial positions of the prostate and tumor in the initial TRUS and MRI inputs, along with an intermediate output from our registration method. The right panel depicts the relationship between the alignment quality of the input data (TRUS and MRI) and the final model performance. The red points represent intermediate outputs from our registration approach. As shown, our method substantially improves the alignment of the TRUS and MRI data compared to the initial input, resulting in an improvement in model performance.
  • Figure 5: Visualization of registration and segmentation results across different training epochs. The top row shows the registration results, with the Dice score for registration performance displayed below each 3D model. The bottom row illustrates the segmentation results, highlighting the segmented tumor regions in color and the corresponding Dice score for segmentation performance. The results are shown at selected epochs to demonstrate the progressive improvement of both registration and segmentation during training. The comparison highlights the relationship between training progression and model performance for both tasks.
  • ...and 2 more figures
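
The captions for Figures 2 and 3 describe a registration module that outputs a transform matrix used for affine alignment. As a rough illustration of what such a matrix does, the sketch below applies a 3×4 affine matrix $[A \mid t]$ to a 3D coordinate. It is an assumption-laden example, not the paper's implementation: the 3×4 layout, the function name `apply_affine`, and the sample matrices are hypothetical, and the summary does not state how the network parameterizes its output.

```python
def apply_affine(matrix, point):
    """Map a 3D point with a 3x4 affine matrix [A | t], returning A @ point + t.

    `matrix` is a list of three rows with four numbers each: the first three
    columns hold the linear part A (rotation, scale, shear) and the last
    column holds the translation t. This layout is assumed for illustration.
    """
    x, y, z = point
    return tuple(
        row[0] * x + row[1] * y + row[2] * z + row[3]
        for row in matrix
    )


# Identity transform: leaves every coordinate unchanged.
identity = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

# Hypothetical transform: scale x by 2, shift x by +1 and z by -3.
scale_and_shift = [
    [2.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -3.0],
]

print(apply_affine(identity, (1.0, 2.0, 3.0)))         # (1.0, 2.0, 3.0)
print(apply_affine(scale_and_shift, (1.0, 2.0, 3.0)))  # (3.0, 2.0, 0.0)
```

Resampling a whole volume applies the same mapping to every voxel coordinate and interpolates intensities, which is how a single predicted matrix can bring the MRI into the TRUS coordinate frame.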