Early Lung Cancer Diagnosis from Virtual Follow-up LDCT Generation via Correlational Autoencoder and Latent Flow Matching

Yutong Wu, Yifan Wang, Qining Zhang, Chuan Zhou, Lei Ying

TL;DR

The paper addresses delays in lung cancer diagnosis caused by the need for follow-up imaging. It proposes CorrFlowNet, a two-module framework that learns a correlated latent space for baseline and follow-up LDCT nodules and uses latent flow matching via a neural ODE to generate a virtual one-year follow-up. A correlational autoencoder captures nodule progression, an auxiliary classifier guides the latent space toward malignancy-relevant features, and a background-aligned rectified flow module enables a robust mapping between timepoints. On the NLST dataset, CorrFlowNet improves early diagnosis performance over baseline methods and achieves results comparable to those obtained with real follow-up scans, suggesting it could accelerate decision-making and treatment planning. The approach combines progression-aware latent representations with diffusion-inspired generative modeling for timely, radiology-driven cancer diagnosis and sets the stage for multi-modality extensions in clinical practice.

Abstract

Lung cancer is one of the most commonly diagnosed cancers, and early diagnosis is critical because the survival rate declines sharply once the disease progresses to advanced stages. However, achieving an early diagnosis remains challenging, particularly in distinguishing subtle early signals of malignancy from those of benign conditions. In clinical practice, a high-risk patient may need to undergo an initial baseline examination and several annual follow-up examinations (e.g., CT scans) before receiving a definitive diagnosis, which can result in missing the optimal treatment window. Recently, Artificial Intelligence (AI) methods have been increasingly used for early diagnosis of lung cancer, but most existing algorithms focus on radiomic feature extraction from single early-stage CT scans. Inspired by recent advances in diffusion models for image generation, this paper proposes a generative method, named CorrFlowNet, which creates a virtual one-year follow-up CT scan from the initial baseline scan. This virtual follow-up would allow malignant and benign nodules to be distinguished early, reducing the need to wait for clinical follow-ups. During training, our approach employs a correlational autoencoder to encode both early baseline and follow-up CT images into a latent space that captures the dynamics of nodule progression as well as the correlations between the two scans, followed by a flow matching algorithm on the latent space with a neural ordinary differential equation. An auxiliary classifier is used to further enhance the diagnostic accuracy. Evaluations on a real clinical dataset show that our method can significantly improve downstream lung nodule risk assessment compared with existing baseline models. Moreover, its diagnostic accuracy is comparable to that of real clinical CT follow-ups, highlighting its potential to improve cancer diagnosis.

Paper Structure

This paper contains 24 sections, 14 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: Examples of early (T0) and follow-up (T1) lung LDCT scans for benign and malignant lung nodules.
  • Figure 2: Comparison between existing lung cancer diagnosis methods and our proposed approach: unlike the clinical convention where a follow-up examination after 6–12 months is typically required to finalize the diagnosis, our method predicts the follow-up lung CT at an early stage, enabling more timely diagnosis for lung cancer patients.
  • Figure 3: Overview of the proposed method. It consists of two modules: a correlational autoencoder and a latent flow matching module. In the correlational autoencoder, we first pre-train a base encoder-decoder pair ($E$, $D$), then initialize the encoders $E_0$ and $E_1$ with the parameters of the base encoder $E$ and the decoders $D_0$ and $D_1$ with those of the base decoder $D$. In the latent flow matching stage, background alignment pretraining is performed first, followed by rectified flow matching with an auxiliary classifier loss $f_i$ that guides the neural ODE in learning class-specific attributes. During inference, the early baseline LDCT scan is used as input to generate the follow-up-year CT embedding $z_1$, which is used to predict malignancy $\hat{y}$ and to synthesize a corresponding follow-up nodule image.
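
As a rough illustration of the rectified flow matching step named in Figure 3, the sketch below builds the straight-line interpolation between a baseline latent $z_0$ and a follow-up latent $z_1$, computes the velocity-regression loss, and integrates a velocity field from $t=0$ to $t=1$ to obtain a predicted follow-up latent. It is not the authors' implementation. The function names (`rectified_flow_pair`, `flow_matching_loss`, `euler_integrate`) and the toy `oracle_velocity` are hypothetical, latents are plain Python lists, and a fixed-step Euler loop stands in for whatever neural ODE solver the paper uses. Background alignment and the auxiliary classifier loss are omitted.

```python
def rectified_flow_pair(z0, z1, t):
    """Return the point z_t on the straight path from z0 to z1 and the
    constant target velocity (z1 - z0) used by rectified flow."""
    z_t = [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]
    v_target = [b - a for a, b in zip(z0, z1)]
    return z_t, v_target


def flow_matching_loss(velocity_fn, z0, z1, t):
    """Mean squared error between the predicted and target velocity at z_t."""
    z_t, v_target = rectified_flow_pair(z0, z1, t)
    v_pred = velocity_fn(z_t, t)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_target)) / len(z0)


def euler_integrate(velocity_fn, z0, steps=10):
    """Integrate dz/dt = velocity_fn(z, t) from t=0 to t=1 with fixed Euler steps."""
    z = list(z0)
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity_fn(z, k * dt)
        z = [zi + dt * vi for zi, vi in zip(z, v)]
    return z


# Toy example (assumed values): a 2-D "baseline" latent and its "follow-up" latent.
z_baseline = [0.0, 1.0]
z_followup = [2.0, -1.0]


def oracle_velocity(z, t):
    """Hypothetical stand-in for a trained network: the exact velocity z1 - z0."""
    return [2.0, -2.0]


loss = flow_matching_loss(oracle_velocity, z_baseline, z_followup, t=0.5)
z_pred = euler_integrate(oracle_velocity, z_baseline, steps=10)
```

In a trained system the oracle would be replaced by a learned network evaluated on encoder outputs. Here the oracle makes the loss zero and the integrated latent match the follow-up latent, which is the behavior training aims for.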