NeoNet: An End-to-End 3D MRI-Based Deep Learning Framework for Non-Invasive Prediction of Perineural Invasion via Generation-Driven Classification

Youngung Han, Minkyung Cha, Kyeonghun Kim, Induk Um, Myeongbin Sho, Joo Young Bae, Jaewon Jung, Jung Hyeok Park, Seojun Lee, Nam-Joon Kim, Woo Kyoung Jeong, Won Jae Lee, Pa Hong, Ken Ying-Kai Liao, Hyuk-Jae Lee

Abstract

Minimizing invasive diagnostic procedures to reduce the risk of patient injury and infection is a central goal in medical imaging. Yet noninvasive diagnosis of perineural invasion (PNI), a critical prognostic factor involving infiltration of tumor cells along the surrounding nerves, remains challenging because there are no clear and consistent imaging criteria for identifying PNI. To address this challenge, we present NeoNet, an integrated end-to-end 3D deep learning framework for PNI prediction in cholangiocarcinoma that does not rely on predefined image features. NeoNet integrates three modules: (1) NeoSeg, which uses a Tumor-Localized ROI Crop (TLCR) algorithm; (2) NeoGen, a 3D Latent Diffusion Model (LDM) with ControlNet, conditioned on anatomical masks to generate synthetic image patches that balance the dataset to a 1:1 ratio; and (3) NeoCls, the final prediction module. For NeoCls, we developed the PNI-Attention Network (PattenNet), which uses the frozen LDM encoder and specialized 3D Dual Attention Blocks (DAB) designed to detect subtle intensity variations and spatial patterns indicative of PNI. In 5-fold cross-validation, NeoNet outperformed baseline 3D models, achieving a maximum AUC of 0.7903.

Paper Structure

This paper contains 35 sections, 5 equations, 5 figures, 5 tables, and 1 algorithm.

Figures (5)

  • Figure 1: Morphological features of perineural invasion (PNI) demonstrated by histopathology (a), CT (b), and MRI (c: PNI-negative; d: PNI-positive). Arrows indicate the locations of characteristic PNI features in each modality.
  • Figure 2: Overview of the NeoNet framework. (1) NeoSeg performs automated liver and tumor segmentation on inputs (e.g., $240\times 240\times 100$), producing segmentation labels (e.g., $240\times 240\times 100$). The TLCR algorithm utilizes these labels and the original image to extract localized image and label patches ($96\times 96\times 48$). (2) NeoGen utilizes a 3D-LDM (Step #1) and trains ControlNet (Step #2) conditioned on label patches to generate synthetic image patches, balancing the dataset. (3) NeoCls utilizes PattenNet, initialized with the frozen LDM encoder ($\mathcal{E}$), to predict PNI status from real and synthetic patches.
  • Figure 3: Architecture of PattenNet for PNI prediction. The model utilizes a frozen encoder from the 3D-LDM followed by two Dual Attention Blocks (DAB). Each DAB integrates 3D channel and spatial attention. Spatial attention utilizes a 3D convolution ($7\times7\times7$) to effectively model 3D spatial relationships. The attended features are processed through global pooling and an MLP for binary classification.
  • Figure 4: Labels generated by NeoSeg. Real Image (left), Segmentation masks (middle: peritumoral mask in red, tumor in yellow), and Overlay (right).
  • Figure 5: Real Patch (a) and Synthetic Patch (b) generated by NeoGen using the corresponding anatomical mask. Morphological features are highlighted by red arrows.
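
The Figure 2 caption describes extracting a $96\times 96\times 48$ patch from a larger volume (e.g., $240\times 240\times 100$) around the segmented tumor. The sketch below illustrates one plausible way to compute such a crop window; it is not the authors' TLCR algorithm, whose details are not given in this excerpt. Centering the window on the tumor centroid and shifting it to stay inside the volume are assumptions, and the function names `tumor_centroid` and `crop_slices` are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def tumor_centroid(voxels: Iterable[Sequence[int]]) -> Tuple[int, int, int]:
    """Rounded mean coordinate of tumor voxels given as (x, y, z) indices."""
    voxels = list(voxels)
    if not voxels:
        raise ValueError("tumor mask contains no voxels")
    n = len(voxels)
    return tuple(round(sum(v[axis] for v in voxels) / n) for axis in range(3))


def crop_slices(
    center: Sequence[int],
    volume_shape: Sequence[int],
    patch_shape: Sequence[int] = (96, 96, 48),
) -> Tuple[slice, slice, slice]:
    """Slices for a fixed-size patch centered on `center`.

    Assumption: when the window would leave the volume it is shifted
    back inside rather than padded.
    """
    slices = []
    for c, size, patch in zip(center, volume_shape, patch_shape):
        if patch > size:
            raise ValueError("patch is larger than the volume along one axis")
        start = c - patch // 2
        # Clamp so that [start, start + patch) lies within [0, size).
        start = max(0, min(start, size - patch))
        slices.append(slice(start, start + patch))
    return tuple(slices)


# Example with the sizes quoted in the Figure 2 caption.
center = tumor_centroid([(118, 121, 49), (122, 119, 51)])
window = crop_slices(center, volume_shape=(240, 240, 100))
```

The returned slices can index an image array and its label array identically (for example `image[window]` and `label[window]` with a NumPy-style array), so the image patch and label patch stay aligned.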