4D VQ-GAN: Synthesising Medical Scans at Any Time Point for Personalised Disease Progression Modelling of Idiopathic Pulmonary Fibrosis

An Zhao, Moucheng Xu, Ahmed H. Shahin, Wim Wuyts, Mark G. Jones, Joseph Jacob, Daniel C. Alexander

TL;DR

This paper tackles the challenge of modelling the progression of Idiopathic Pulmonary Fibrosis (IPF) from high-dimensional CT volumes by learning a continuous 4D trajectory. It introduces 4D-VQ-GAN, which uses a 3D-VQ-GAN to encode CT volumes into a discrete latent space and a neural ODE to model their temporal evolution, so that CT volumes can be generated at any time point from sparse observations. The authors evaluate the method on IPF CT data through interpolation and extrapolation tasks and through survival analysis, where imaging biomarkers derived from generated scans achieve a C-index comparable to that of biomarkers derived from real scans. The work targets personalised disease progression modelling, with potential uses in synthetic data generation, data augmentation, and treatment planning for IPF.

Abstract

Understanding the progression trajectories of diseases is crucial for early diagnosis and effective treatment planning. This is especially vital for life-threatening conditions such as Idiopathic Pulmonary Fibrosis (IPF), a chronic, progressive lung disease with a prognosis comparable to many cancers. Computed tomography (CT) imaging has been established as a reliable diagnostic tool for IPF. Accurately predicting future CT scans of early-stage IPF patients can aid in developing better treatment strategies, thereby improving survival outcomes. In this paper, we propose 4D Vector Quantised Generative Adversarial Networks (4D-VQ-GAN), a model capable of generating realistic CT volumes of IPF patients at any time point. The model is trained using a two-stage approach. In the first stage, a 3D-VQ-GAN is trained to reconstruct CT volumes. In the second stage, a Neural Ordinary Differential Equation (ODE) based temporal model is trained to capture the temporal dynamics of the quantised embeddings generated by the encoder in the first stage. We evaluate different configurations of our model for generating longitudinal CT scans and compare the results against ground truth data, both quantitatively and qualitatively. For validation, we conduct survival analysis using imaging biomarkers derived from generated CT scans and achieve a C-index comparable to that of biomarkers derived from the real CT scans. The survival analysis results demonstrate the potential clinical utility inherent to generated longitudinal CT scans, showing that they can reliably predict survival outcomes.
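
The C-index used for validation in the abstract can be illustrated with a small worked example. The sketch below implements Harrell's concordance index in plain Python, under the assumption that this standard definition is the one meant; the function name `concordance_index` and the toy follow-up times, event flags, and risk scores are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs that the risk score ranks correctly.

    times  -- follow-up time per subject
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    risks  -- predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # `a` is the subject with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter follow-up is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk for the earlier event: correct ordering
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four subjects, the third one is censored.
toy_times = [2.0, 4.0, 5.0, 7.0]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
print(toy_c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why a comparable C-index for real and generated scans suggests that the generated scans retain prognostic information.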

Paper Structure

This paper contains 23 sections, 7 equations, 9 figures, and 3 tables.

Figures (9)

  • Figure 1: Overview of our two-stage training strategy. The first stage trains an encoder-decoder-based 3D-VQ-GAN to reconstruct the CT volumes. The second stage takes the latent embeddings ($z_t, \dots, z_0$) from the first stage and trains a temporal model to reconstruct them. The temporal model includes a 3D-ConvGRU that compresses the temporal latent embeddings to match the input dimensionality of the ODE Solver, easing the computational burden. The projector, a lightweight 3D convolutional module, reconstructs the temporal latent embeddings from the outputs of the ODE Solver. The reconstructed latent embeddings are then fed into the frozen 3D-VQ-GAN Decoder from stage 1 for longitudinal CT reconstruction.
  • Figure 2: Inference with the trained model. Given two scans, the model can generate further scans up to a specified year at a specified interval.
  • Figure 3: Three real CT scans of an IPF patient are shown in the upper panel in axial, coronal, and sagittal sections. Using two scans from year 0 and year 2, the trained model can generate CT scans at arbitrary time points. The lower panel shows the generated CT images at five different time points, three of which correspond to the real scans. A zoomed region of the left lower lobe (yellow box) in the real and generated CT scans shows comparable amounts of architectural distortion, patterned ground glass opacification and reticulation, all hallmarks of lung fibrosis.
  • Figure 4: Segmentation results for selected cases from the Leuven cohort.
  • Figure 5: Visualization of four registration outcomes with a focus on lung areas for clarity. The left two columns present axial views before and after registration, while the right two columns showcase coronal views. Baseline scans are denoted in blue, whereas follow-up scans are highlighted in yellow. The merging of colours results in grey or white hues, indicating aligned structures due to RGB amalgamation. Notably, follow-up scans are registered to their corresponding baseline scans. The first two rows illustrate cases with successful registration outcomes, while the subsequent two rows demonstrate instances of varying degrees of misalignment.
  • ...and 4 more figures
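
The inference procedure described in the captions of Figures 1 and 2 (two observed scans in, latent states at a regular time grid out) can be sketched in a few lines. This is a toy illustration only, not the authors' implementation: the constant-velocity `dynamics` function stands in for the learned network, fixed-step Euler integration stands in for the ODE Solver, short Python lists stand in for 3D latent embeddings, and the names `query_times`, `euler_rollout`, and `quantise`, as well as the codebook values, are all hypothetical.

```python
def query_times(end_year, interval):
    """Time grid from year 0 up to end_year (inclusive) at a fixed interval."""
    n = int(round(end_year / interval))
    return [k * interval for k in range(n + 1)]


def euler_rollout(z0, dynamics, times, substeps=20):
    """Integrate dz/dt = dynamics(z, t) with fixed-step Euler; return z at every time."""
    z, t = list(z0), times[0]
    out = [list(z)]
    for t_next in times[1:]:
        h = (t_next - t) / substeps
        for _ in range(substeps):
            dz = dynamics(z, t)
            z = [zi + h * di for zi, di in zip(z, dz)]
            t += h
        out.append(list(z))
    return out


def quantise(z, codebook):
    """Index of the nearest codebook vector under squared Euclidean distance."""
    def sq_dist(k):
        return sum((a - b) ** 2 for a, b in zip(z, codebook[k]))
    return min(range(len(codebook)), key=sq_dist)


# Toy latents for the two observed scans (hypothetical values).
z_year0 = [0.0, 1.0]
z_year2 = [1.0, 0.0]

# Assumption: constant velocity estimated from the two observations.
# In the paper this role is played by a trained temporal model.
velocity = [(b - a) / 2.0 for a, b in zip(z_year0, z_year2)]


def dynamics(z, t):
    return velocity


times = query_times(end_year=4, interval=1)        # years 0, 1, 2, 3, 4
trajectory = euler_rollout(z_year0, dynamics, times)

# Hypothetical codebook; snapping to the nearest entry illustrates a discrete latent space.
codebook = [[0.0, 1.0], [1.0, 0.0], [2.0, -1.0]]
codes = [quantise(z, codebook) for z in trajectory]
```

Years 1 and 3 in this grid are interpolated or extrapolated states that have no observed counterpart, which mirrors the "any time point" claim. In the actual model each latent state would be passed through the projector and the frozen decoder to obtain a CT volume, a step this sketch omits.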