Location-based Radiology Report-Guided Semi-supervised Learning for Prostate Cancer Detection

Alex Chen, Nathan Lay, Stephanie Harmon, Kutsev Ozyoruk, Enis Yilmaz, Brad J. Wood, Peter A. Pinto, Peter L. Choyke, Baris Turkbey

TL;DR

This work tackles the annotation bottleneck in MRI-based prostate cancer detection by introducing a lesion location-guided semi-supervised learning framework that leverages radiology report information to refine pseudo labels on unlabeled data. A teacher–student architecture with nnU-Net segmentation uses report-derived lesion locations to correct pseudo labels, outperforming supervised and lesion-count SSL methods, particularly when manual annotations are scarce. The approach maintains competitive segmentation accuracy while reducing false positives, demonstrating a practical path to scale prostate cancer detection models with larger unlabeled datasets. Limitations include reliance on structured reports and PI-RADS-based localization, suggesting future integration with unstructured-report processing and extension to other organ systems.

Abstract

Prostate cancer is one of the most prevalent malignancies in the world. While deep learning has potential to further improve computer-aided prostate cancer detection on MRI, its efficacy hinges on the exhaustive curation of manually annotated images. We propose a novel methodology of semisupervised learning (SSL) guided by automatically extracted clinical information, specifically the lesion locations in radiology reports, allowing for use of unannotated images to reduce the annotation burden. By leveraging lesion locations, we refined pseudo labels, which were then used to train our location-based SSL model. We show that our SSL method can improve prostate lesion detection by utilizing unannotated images, with more substantial impacts being observed when larger proportions of unannotated images are used.
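The abstract does not spell out how lesion locations refine the pseudo labels, so the following is only a minimal sketch of one plausible reading: keep each predicted lesion component only if it overlaps a region the report names. The 2D toy mask, the `zone_map` grid, the zone names `"PZ"` and `"TZ"`, and the function names are all hypothetical illustrations, not the authors' implementation (which operates on 3D MRI volumes with nnU-Net).

```python
from collections import deque


def connected_components(mask):
    """Return the 4-connected foreground components of a 2D binary mask.

    Each component is a list of (row, col) coordinates.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unseen foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    in_bounds = 0 <= ny < rows and 0 <= nx < cols
                    if in_bounds and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(component)
    return components


def refine_pseudo_label(mask, zone_map, reported_zones):
    """Keep only predicted components that touch a report-indicated zone.

    Assumption: the report has already been parsed into a set of zone names,
    and zone_map assigns one zone name to every pixel.
    """
    refined = [[0] * len(row) for row in mask]
    for component in connected_components(mask):
        if any(zone_map[y][x] in reported_zones for y, x in component):
            for y, x in component:
                refined[y][x] = 1
    return refined


# Toy teacher prediction with two candidate lesions.
teacher_mask = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
# Hypothetical zone layout: left half "PZ", right half "TZ".
zone_map = [
    ["PZ", "PZ", "TZ", "TZ"],
    ["PZ", "PZ", "TZ", "TZ"],
    ["PZ", "PZ", "TZ", "TZ"],
]
# The (hypothetical) report mentions a lesion only in "TZ".
refined_mask = refine_pseudo_label(teacher_mask, zone_map, {"TZ"})
```

In this toy case the component in the left half is discarded because the report lists no lesion there, while the component in the right half is kept as the pseudo label for student training.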

Paper Structure (9 sections, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Workflow of the semi-supervised learning method with lesion location-based report guidance for prostate cancer detection. bpMRI = biparametric MRI, T2W = T2-weighted, ADC = apparent diffusion coefficient, HBV = high-b value.
  • Figure 2: Example case in which lesion location-based report guidance is needed to generate pseudo labels.
  • Figure 3: Free-response receiver operating characteristic (FROC) curve of the supervised model trained on 300 annotated cases compared to semi-supervised models trained on 300 annotated cases and 1634 unlabeled cases, all evaluated on 445 annotated test cases.
  • Figure 4: Example patient from the test set with segmentations from the Count-based SSL (DSC=0.77) and Location-based SSL (DSC=0.82) trained on 500 manually labeled cases and 1434 unlabeled cases. T2W = T2-weighted, ADC = apparent diffusion coefficient, HBV = high-b value.