Conditional Diffusion Model with Anatomical-Dose Dual Constraints for End-to-End Multi-Tumor Dose Prediction
Hui Xie, Haiqin Hu, Lijuan Ding, Qing Li, Yue Sun, Tao Tan
TL;DR
The paper introduces ADDiff-Dose, a conditional diffusion model for radiotherapy dose prediction that operates in a compressed latent space produced by LightweightVAE3D and is guided by multimodal inputs including tumor/OAR masks, beam parameters, and over 50 clinical dose-volume constraints. A composite loss and an organ-existence gating mechanism target both dosimetric accuracy and strict clinical compliance, while a two-stage training regime enables end-to-end multi-tumor prediction across head-and-neck and lung cases. Empirical results on a large public dataset and three external cohorts show state-of-the-art MAE (≈0.101 Gy), Dice (≈0.927), and tight constraint adherence (e.g., spinal cord D_{max} near 0.0005 Gy), with plan-generation times around 22 seconds. The results indicate strong generalization and robustness across cohorts, positioning the method as a scalable, automated alternative to traditional trial-and-error planning and a foundation for future expansion to more sites and techniques.
Abstract
Radiotherapy treatment planning often relies on time-consuming, trial-and-error adjustments that depend heavily on specialist expertise, while existing deep learning methods are limited in generalization, prediction accuracy, and clinical applicability. To address these challenges, we propose ADDiff-Dose, an Anatomical-Dose Dual Constraints Conditional Diffusion Model for end-to-end multi-tumor dose prediction. The model employs LightweightVAE3D to compress high-dimensional CT data and integrates multimodal inputs, including target and organ-at-risk (OAR) masks and beam parameters, within a progressive noise addition and denoising framework. It incorporates conditional features via a multi-head attention mechanism and uses a composite loss function combining MSE, conditional terms, and KL divergence to ensure both dosimetric accuracy and compliance with clinical constraints. Evaluation on a large-scale public dataset (2,877 cases) and three external institutional cohorts (450 cases in total) demonstrates that ADDiff-Dose significantly outperforms traditional baselines, achieving an MAE of 0.101-0.154 (compared to 0.316 for UNet and 0.169 for GAN models) and a Dice coefficient of 0.927 (a 6.8% improvement), while limiting spinal cord maximum dose error to within 0.1 Gy. The average plan generation time per case is reduced to 22 seconds. Ablation studies confirm that the structural encoder enhances compliance with clinical dose constraints by 28.5%. To our knowledge, this is the first study to introduce a conditional diffusion model framework for radiotherapy dose prediction. It offers a generalizable and efficient solution for automated treatment planning across diverse tumor sites, with the potential to substantially reduce planning time and improve clinical workflow efficiency.
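
To make the "progressive noise addition" and "composite loss" ideas concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a standard DDPM-style linear noise schedule, an L1 dose error inside one OAR mask as a stand-in for the "conditional terms", a boolean flag as a stand-in for the organ-existence gating mentioned in the TL;DR, and arbitrary loss weights. All function names, parameter names, and default values here are hypothetical; the abstract does not specify them.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Assumed DDPM-style linear variance schedule (not stated in the abstract)."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_noise(z0, t, alpha_bar, rng):
    """Sample z_t ~ q(z_t | z_0) for a latent dose volume z0 at step t."""
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return zt, eps

def kl_to_standard_normal(mu, logvar):
    """Mean KL divergence between N(mu, exp(logvar)) and N(0, 1)."""
    return float(-0.5 * np.mean(1.0 + logvar - mu**2 - np.exp(logvar)))

def composite_loss(eps_pred, eps_true, dose_pred, dose_ref, oar_mask,
                   organ_present, mu, logvar, w_cond=1.0, w_kl=1e-3):
    """Hypothetical composite loss: noise MSE + gated OAR term + KL term."""
    mse = float(np.mean((eps_pred - eps_true) ** 2))
    # Assumed gating: skip the OAR term when the organ is absent from the case.
    if organ_present and oar_mask.any():
        cond = float(np.mean(np.abs(dose_pred - dose_ref)[oar_mask]))
    else:
        cond = 0.0
    kl = kl_to_standard_normal(mu, logvar)
    return mse + w_cond * cond + w_kl * kl

# Usage on toy data (shapes and values are illustrative only).
rng = np.random.default_rng(0)
alpha_bar = np.cumprod(1.0 - linear_beta_schedule(1000))
z0 = rng.standard_normal((4, 8, 8, 8))          # toy latent dose volume
zt, eps = forward_noise(z0, t=500, alpha_bar=alpha_bar, rng=rng)

dose_ref = np.zeros((8, 8, 8))
dose_pred = dose_ref + 0.1                       # toy prediction error
oar_mask = np.zeros((8, 8, 8), dtype=bool)
oar_mask[2:4, 2:4, 2:4] = True                   # toy OAR region
loss = composite_loss(eps, eps, dose_pred, dose_ref, oar_mask,
                      organ_present=True, mu=np.zeros(16), logvar=np.zeros(16))
print(f"toy composite loss: {loss:.4f}")
```

In a real training loop the noise prediction would come from the conditional denoising network rather than being set equal to the true noise, and the conditional term would cover every structure and dose-volume constraint in the case rather than a single mask.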
