Classification of lung cancer subtypes on CT images with synthetic pathological priors

Wentao Zhu, Yuan Jin, Gege Ma, Geng Chen, Jan Egger, Shaoting Zhang, Dimitris N. Metaxas

TL;DR

This work tackles automatic CT-based classification of lung cancer subtypes (lung adenocarcinoma, LUAD, vs. lung squamous cell carcinoma, LUSC) by introducing SGHF-Net, a multi-module framework that learns pathologically informed priors from paired CT and histopathology data and fuses them with radiological features. The Pathological Feature Synthetic Module (PFSM) maps cross-scale CT–pathology associations and synthesizes a 512×1 pathology-like feature from CT via a conditional GAN (CGAN), while the Radiological Feature Extraction Module (RFEM) extracts radiological features from CT; the two feature sets are fused for the final classification. The approach achieves state-of-the-art accuracy and AUC on a large multi-center dataset, and ablation analyses confirm the complementary roles of the PFSM and RFEM. External validation across additional centers demonstrates robustness, though multi-center variability remains a challenge. The method shows the practical potential of cross-modality priors to enhance noninvasive imaging-based cancer diagnostics and can be extended to broader multi-modal medical tasks.
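As a rough illustration of the data flow described above, which the Figure 2 caption writes as $P=C(F(f_{\text{p}},f_{\text{r}}))$, the following minimal sketch runs one forward pass in plain Python. It is not the authors' implementation. The CGAN generator and the RFEM backbone are replaced by single random linear layers, fusion is assumed to be plain concatenation, and `CT_DIM`, the radiological feature size, and all function names are hypothetical. Only the 512-element size of the pathology-like feature comes from the text.

```python
import math
import random

FEATURE_DIM = 512  # size of the synthesized pathology-like feature stated in the text
CT_DIM = 64        # hypothetical length of a flattened CT input, chosen for illustration


def random_layer(rng, out_dim, in_dim, scale=0.01):
    """Placeholder (weights, bias) pair; a trained network would supply these."""
    weights = [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def linear(weights, bias, x):
    """Affine map y = W x + b on plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sghf_forward(ct_vector, pfsm, rfem, classifier):
    """Compute P = C(F(f_p, f_r)) for a single CT input."""
    f_p = linear(*pfsm, ct_vector)  # stand-in for the PFSM generator output
    f_r = linear(*rfem, ct_vector)  # stand-in for the RFEM feature (size assumed)
    fused = f_p + f_r               # F: list concatenation (assumed fusion)
    return softmax(linear(*classifier, fused))  # C: probabilities for (LUAD, LUSC)


rng = random.Random(0)
pfsm = random_layer(rng, FEATURE_DIM, CT_DIM)
rfem = random_layer(rng, FEATURE_DIM, CT_DIM)
classifier = random_layer(rng, 2, 2 * FEATURE_DIM)

ct_vector = [rng.random() for _ in range(CT_DIM)]
probs = sghf_forward(ct_vector, pfsm, rfem, classifier)
print(probs)
```

The point of the sketch is that both features are computed from the same CT input, so no pathological image is needed at inference time.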

Abstract

Accurate diagnosis of the pathological subtype of lung cancer is important for follow-up treatment and prognosis management. In this paper, we propose a self-generating hybrid feature network (SGHF-Net) for accurately classifying lung cancer subtypes on computed tomography (CT) images. Inspired by studies reporting cross-scale associations between the image patterns of a case's CT images and its pathological images, we developed a pathological feature synthetic module (PFSM), which quantitatively maps cross-modality associations through deep neural networks in order to derive, from CT images, the "gold standard" information contained in the corresponding pathological images. Additionally, we designed a radiological feature extraction module (RFEM) to directly acquire CT image information and integrated it with the pathological priors under an effective feature fusion framework, enabling the entire classification model to generate more indicative and specific pathologically related features and ultimately output more accurate predictions. The strength of the proposed model lies in its ability to self-generate hybrid features that contain multi-modality image information from a single-modality input. To evaluate the effectiveness, adaptability, and generalization ability of our model, we performed extensive experiments on a large-scale multi-center dataset (829 cases from three hospitals), comparing our model with a series of state-of-the-art (SOTA) classification models. The experimental results demonstrated the superiority of our model for lung cancer subtype classification, with significant improvements in accuracy (ACC), area under the curve (AUC), and F1 score.
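For readers unfamiliar with the three reported metrics, the sketch below computes ACC, F1, and AUC for a binary task using the standard definitions. The labels and scores are made-up toy values, not results from the paper. Treating class 1 as the positive class and thresholding at 0.5 are assumptions, and the paper may use a different averaging scheme for F1.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores, positive=1):
    """Probability that a random positive case outscores a random negative one.

    Ties count as half a win; this equals the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data only: 1 and 0 stand for the two subtypes (assignment is arbitrary).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]          # predicted probability of class 1
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), auc(y_true, scores))
```

ACC and F1 depend on the chosen decision threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds.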

Paper Structure

This paper contains 22 sections, 16 equations, 7 figures, and 4 tables.

Figures (7)

  • Figure 1: The detailed image patterns in the paired CT images and pathological images of a LUAD case, (a) and (b), and a LUSC case, (c) and (d).
  • Figure 2: Pipeline of the proposed novel CT-based classification model, SGHF-Net: $P=C(F(f_{\text{p}},f_{\text{r}}))$, where $C(\cdot)$ denotes the classification component; $F(\cdot)$ denotes the fusion module; $f_{\text{p}}$ and $f_{\text{r}}$ are the pathological and radiological features, respectively.
  • Figure 3: The training procedure of the PFSM. Notably, the pathological images, as the gold-standard reference images, are only required during the training process of the PFSM.
  • Figure 4: The workflow of pathological image and CT image preprocessing: (a) obtaining the original WSI; (b) conducting color normalization; (c) delineating the ROI; (d) generating constant-size patches from the WSI; (e) obtaining the original CT image; (f) segmenting masks; (g) generating a constant-size patch from the CT image.
  • Figure 5: The ablation test of the PFSM in terms of ACC, AUC and F1 score.
  • ...and 2 more figures