Multi-modality transrectal ultrasound video classification for identification of clinically significant prostate cancer

Hong Wu, Juan Fu, Hongsheng Ye, Yuming Zhong, Xuebin Zhou, Jianhua Zhou, Yi Wang

TL;DR

This paper addresses the identification of clinically significant prostate cancer (csPCa) from transrectal ultrasound (TRUS) videos by combining two modalities, B-mode and shear wave elastography (SWE), in a dual-stream architecture. It introduces an adaptive spatial fusion module that fuses the two modalities' features at the voxel level and an orthogonal regularization term that reduces feature redundancy, and it uses Grad-CAM visualizations to help guide biopsy. On an in-house dataset of 512 TRUS videos, the method achieves an AUC of $0.84$ (F1 $0.86$, accuracy $0.79$), outperforming single-modality baselines and prior work lacking prostate masks. The approach has practical potential for TRUS-guided targeted biopsy, and the implementation is publicly available for reproducibility.
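The three reported metrics (AUC, F1, accuracy) can be illustrated with a small, dependency-free sketch. The toy labels and scores and the 0.5 decision threshold are assumptions made for illustration only; they are not taken from the paper, and the authors' evaluation code may differ.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_and_f1(labels, scores, threshold=0.5):
    """Accuracy and F1 after thresholding scores (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    acc = correct / len(labels)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return acc, f1


# Toy data: 1 = csPCa, 0 = not csPCa (made-up values).
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.1]
auc = roc_auc(labels, scores)
acc, f1 = accuracy_and_f1(labels, scores)
```

AUC is threshold-free, whereas F1 and accuracy depend on the chosen operating point, which is one reason the three numbers can differ noticeably on the same test set.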

Abstract

Prostate cancer is the most common noncutaneous cancer in the world. Recently, multi-modality transrectal ultrasound (TRUS) has increasingly become an effective tool for the guidance of prostate biopsies. With the aim of effectively identifying prostate cancer, we propose a framework for the classification of clinically significant prostate cancer (csPCa) from multi-modality TRUS videos. The framework utilizes two 3D ResNet-50 models to extract features from B-mode images and shear wave elastography (SWE) images, respectively. An adaptive spatial fusion module is introduced to aggregate the features of the two modalities. An orthogonal regularized loss is further used to mitigate feature redundancy. The proposed framework is evaluated on an in-house dataset containing 512 TRUS videos and achieves favorable performance in identifying csPCa, with an area under the curve (AUC) of 0.84. Furthermore, the class activation mapping (CAM) visualizations generated by the proposed framework may provide valuable guidance for the localization of csPCa, thus facilitating TRUS-guided targeted biopsy. Our code is publicly available at https://github.com/2313595986/ProstateTRUS.
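The abstract does not spell out how the adaptive spatial fusion module works. One plausible reading, sketched below, is that each spatial location receives a pair of weights (one per modality) that sum to one, and the fused feature is the weighted sum of the B-mode and SWE features at that location. Everything here is an assumption for illustration: the function name, the use of a softmax, and the fact that the per-location logits are passed in directly rather than predicted by a learned layer, as they presumably are in the authors' network.

```python
import math


def adaptive_spatial_fusion(feat_b, feat_swe, logit_b, logit_swe):
    """Fuse two feature maps with per-location softmax weights (hypothetical).

    feat_b, feat_swe: lists of feature vectors, one per spatial location.
    logit_b, logit_swe: lists of scalars, one per location; in a real model
    these would come from a learned layer, here they are plain inputs.
    """
    fused = []
    for fb, fs, lb, ls in zip(feat_b, feat_swe, logit_b, logit_swe):
        m = max(lb, ls)  # subtract the max for numerical stability
        eb, es = math.exp(lb - m), math.exp(ls - m)
        w_b = eb / (eb + es)
        w_swe = 1.0 - w_b
        fused.append([w_b * x + w_swe * y for x, y in zip(fb, fs)])
    return fused


# Two locations with 2-channel features (made-up numbers).
feat_b = [[1.0, 2.0], [1.0, 2.0]]
feat_swe = [[3.0, 4.0], [3.0, 4.0]]
# Location 0 trusts both modalities equally; location 1 strongly prefers B-mode.
fused = adaptive_spatial_fusion(feat_b, feat_swe, [0.0, 50.0], [0.0, -50.0])
```

With equal logits the fusion reduces to a plain average, and with a very large logit gap it effectively selects one modality, so the weights interpolate between the two extremes location by location.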

Paper Structure (11 sections, 4 equations, 4 figures, 1 table)


Figures (4)

  • Figure 1: A frame extracted from the transrectal ultrasound (TRUS) video, displaying both B-mode and shear wave elastography (SWE) images. The B-mode provides anatomical details, while the SWE depicts tissue stiffness.
  • Figure 2: A schematic overview of our proposed framework for csPCa classification in TRUS videos. The classification framework consists of two 3D ResNet-50 models that extract features from B-mode images and SWE images, respectively. The adaptive spatial fusion module is designed to aggregate the features of the two modalities. The orthogonal regularization encourages weights to be orthogonal, thereby mitigating feature redundancy.
  • Figure 3: The receiver operating characteristic (ROC) curves of different methods on the testing set.
  • Figure 4: Six examples showing TRUS frames and their corresponding heat maps generated using class activation mapping (CAM). The regions highlighted in the heat maps contribute to the network's prediction of csPCa and may therefore indicate the lesion areas associated with csPCa.
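The Figure 2 caption says that the orthogonal regularization encourages weights to be orthogonal. The paper's exact loss is not reproduced on this page, so the sketch below uses a common soft-orthogonality penalty, the squared Frobenius norm of $W W^\top - I$, as a stand-in; the function name and the choice of penalizing rows of a 2D weight matrix are assumptions.

```python
def orthogonal_penalty(weight):
    """Squared Frobenius norm of (W @ W^T - I) for a 2D weight matrix.

    weight: list of rows, each a list of floats. The penalty is zero exactly
    when the rows are orthonormal, and grows as rows become correlated.
    """
    n_rows = len(weight)
    penalty = 0.0
    for i in range(n_rows):
        for j in range(n_rows):
            gram_ij = sum(a * b for a, b in zip(weight[i], weight[j]))
            target = 1.0 if i == j else 0.0
            penalty += (gram_ij - target) ** 2
    return penalty


# Orthonormal rows incur no penalty; duplicated rows are penalized.
orthonormal = [[1.0, 0.0], [0.0, 1.0]]
redundant = [[1.0, 0.0], [1.0, 0.0]]
p_ortho = orthogonal_penalty(orthonormal)
p_redundant = orthogonal_penalty(redundant)
```

In training, a term like this would typically be scaled by a small coefficient and added to the classification loss, so that redundant (highly correlated) filters are discouraged without dominating the main objective.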