Longitudinal NSCLC Treatment Progression via Multimodal Generative Models

Massimiliano Mantegna; Elena Mulero Ayllón; Alice Natalina Caragliano; Francesco Di Feola; Claudia Tacconi; Michele Fiore; Edy Ippolito; Carlo Greco; Sara Ramella; Philippe C. Cattin; Paolo Soda; Matteo Tortora; Valerio Guarrasi

TL;DR

Quantitative and qualitative results indicate that diffusion-based models benefit more consistently from multimodal, dose-aware conditioning than GAN-based baselines and produce more stable, anatomically plausible tumor evolution trajectories. This supports the potential of the Virtual Treatment (VT) framework as a tool for in-silico treatment monitoring and adaptive radiotherapy research in NSCLC.

Abstract

Predicting tumor evolution during radiotherapy is a clinically critical challenge, particularly when longitudinal changes are driven by both anatomy and treatment. In this work, we introduce a Virtual Treatment (VT) framework that formulates non-small cell lung cancer (NSCLC) progression as a dose-aware multimodal conditional image-to-image translation problem. Given a CT scan, baseline clinical variables, and a specified radiation dose increment, VT aims to synthesize plausible follow-up CT images reflecting treatment-induced anatomical changes. We evaluate the proposed framework on a longitudinal dataset of 222 stage III NSCLC patients, comprising 895 CT scans acquired during radiotherapy under irregular clinical schedules. The generative process is conditioned on delivered dose increments together with demographic and tumor-related clinical variables. Representative GAN-based and diffusion-based models are benchmarked across 2D and 2.5D configurations. Quantitative and qualitative results indicate that diffusion-based models benefit more consistently from multimodal, dose-aware conditioning and produce more stable and anatomically plausible tumor evolution trajectories than GAN-based baselines, supporting the potential of VT as a tool for in-silico treatment monitoring and adaptive radiotherapy research in NSCLC.
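As a rough illustration of the dose-aware setup described in the abstract, the sketch below enumerates (input scan, follow-up scan, dose increment) triples for one patient whose scans were acquired at irregular cumulative doses. The `Scan` type, the scan identifiers, the dose values, and the choice to use every ordered pair of scans are assumptions made for illustration only; the abstract does not specify how training pairs are formed.

```python
from itertools import combinations
from typing import NamedTuple


class Scan(NamedTuple):
    """Hypothetical record for one CT acquisition of a patient."""
    scan_id: str
    cumulative_dose_gy: float  # dose delivered up to this scan, in Gy


def dose_increment_pairs(scans):
    """Return (input_id, followup_id, dose_increment_gy) triples.

    Assumption: every earlier/later scan pair of the same patient is a
    valid example, conditioned on the dose delivered between the two.
    """
    ordered = sorted(scans, key=lambda s: s.cumulative_dose_gy)
    pairs = []
    for earlier, later in combinations(ordered, 2):
        delta = later.cumulative_dose_gy - earlier.cumulative_dose_gy
        if delta > 0:  # skip scans acquired at the same cumulative dose
            pairs.append((earlier.scan_id, later.scan_id, delta))
    return pairs


# Illustrative, irregularly spaced schedule (values are made up).
patient_scans = [
    Scan("ct0", 0.0),
    Scan("ct1", 20.0),
    Scan("ct2", 36.0),
    Scan("ct3", 60.0),
]
pairs = dose_increment_pairs(patient_scans)
```

Conditioning on the increment rather than on an absolute time point is what lets irregular clinical schedules be handled uniformly: each pair contributes one example regardless of when the scans were taken.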

Paper Structure (15 sections, 17 equations, 4 figures)

Introduction
Materials
Preprocessing
Methods
Problem Setting
Virtual Treatment as a generative formulation
Experimental Setup
Implementation and Training Details
2D GAN-Based Models
2.5D Diffusion-Based Model
Results
Tumor Segmentation Evaluation
Computational Cost Analysis
Qualitative Results
Conclusions

Figures (4)

Figure 1: Overview of the proposed multimodal VT framework. (A) Problem setting. Multimodal conditioning for virtual treatment forecasting (green: input CT and baseline clinical variables; pink: dose increment). (B) Generative formulation. The generator $G_{\bm{\theta}}$ synthesizes follow-up CTs conditioned on input CT, clinical features, and dose increment, optimized with a reconstruction loss and a tumor-focused term within the CTV (orange). Inference follows the direct dose-response strategy (blue) described in the Methods section.
Figure 2: Percentage volumetric discrepancy $|\Delta V|$ between real and generated segmentations as a function of the dose increment $\delta\mathrm{Dose}$ (in Gy), aggregated into discrete dose bins (10--60 Gy). Each curve represents one generative model.
Figure 3: Comparison of models in terms of $|\Delta V|$ error (y-axis) and computational cost measured in GMACs (x-axis, log scale) for the training (darker bubbles) and inference (lighter bubbles) phases. Bubble size is proportional to the number of model parameters (10M and 30M).
Figure 4: Qualitative comparison across increasing dose levels for a representative patient treated with a 2 Gy/day fractionation protocol. Columns correspond to increasing cumulative dose, while rows show the GT follow-up CT and the predictions generated by each model. The red contour indicates the CTV region used for Local analysis.
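
The $|\Delta V|$ metric reported in Figures 2 and 3 can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: it assumes $|\Delta V| = 100 \cdot |V_{gen} - V_{real}| / V_{real}$, that both segmentations are binary masks on the same voxel grid (so voxel counts stand in for volumes), and that each dose increment is assigned to the nearest 10 Gy bin center. The function names and bin centers are hypothetical.

```python
import numpy as np


def percent_volume_discrepancy(real_mask, generated_mask):
    """Absolute volume difference as a percentage of the real volume.

    Assumes both masks share the same voxel spacing, so the voxel
    count is proportional to the physical volume.
    """
    real_volume = np.count_nonzero(real_mask)
    if real_volume == 0:
        raise ValueError("real segmentation is empty; |dV| is undefined")
    generated_volume = np.count_nonzero(generated_mask)
    return 100.0 * abs(generated_volume - real_volume) / real_volume


def mean_discrepancy_per_dose_bin(dose_increments_gy, discrepancies,
                                  bin_centers=(10, 20, 30, 40, 50, 60)):
    """Average |dV| over samples grouped by nearest dose-bin center."""
    centers = np.asarray(bin_centers, dtype=float)
    doses = np.asarray(dose_increments_gy, dtype=float)
    values = np.asarray(discrepancies, dtype=float)
    # Index of the closest bin center for every sample.
    nearest = np.abs(doses[:, None] - centers[None, :]).argmin(axis=1)
    return {
        float(center): float(values[nearest == i].mean())
        for i, center in enumerate(centers)
        if np.any(nearest == i)  # omit bins without samples
    }


# Toy masks: the generated tumor has half the voxels of the real one.
real = np.zeros((4, 4, 4), dtype=bool)
real[:2] = True
generated = np.zeros((4, 4, 4), dtype=bool)
generated[:1] = True
example_dv = percent_volume_discrepancy(real, generated)
example_bins = mean_discrepancy_per_dose_bin([9, 12, 31], [10.0, 30.0, 5.0])
```

Normalizing by the real volume keeps the metric comparable across patients with very different tumor sizes, at the cost of being undefined when the reference segmentation is empty.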
