MInDI-3D: Iterative Deep Learning in 3D for Sparse-view Cone Beam Computed Tomography
Daniel Barco, Marc Stadelmann, Martin Oswald, Ivo Herzig, Lukas Lichtensteiger, Pascal Paysan, Igor Peterlik, Michal Walczak, Bjoern Menze, Frank-Peter Schilling
TL;DR
MInDI-3D extends InDI to full 3D volumes, enabling iterative diffusion-based artefact removal for sparse-view CBCT with the aim of reducing radiation exposure. Trained on a large pseudo-CBCT dataset derived from CT-RATE and validated on a real-world HyperSight set, it achieves substantial PSNR/SSIM gains over uncorrected scans (e.g., +12.96 dB PSNR with 50 projections on the CT-RATE pseudo-CBCT test set) and performs competitively with 3D U-Net baselines, generalising well across anatomies and projection levels. The number of iterative steps provides a controllable perception-distortion trade-off. Clinicians gave positive feedback for patient positioning tasks, while domain shift remains a challenge for dose calculation and contouring. Overall, MInDI-3D shows promise as a clinically viable tool for high-fidelity 3D CBCT reconstruction at reduced radiation dose, supported by scalable training data, extensive quantitative and clinical evaluation, and a publicly released pseudo-CBCT dataset to foster future research.
Abstract
We present MInDI-3D (Medical Inversion by Direct Iteration in 3D), the first 3D conditional diffusion-based model for real-world sparse-view Cone Beam Computed Tomography (CBCT) artefact removal, aiming to reduce imaging radiation exposure. A key contribution is extending the "InDI" concept from 2D to a full 3D volumetric approach for medical images, implementing an iterative denoising process that refines the CBCT volume directly from sparse-view input. A further contribution is the generation of a large pseudo-CBCT dataset (16,182 volumes) from chest CT volumes of the public CT-RATE dataset to robustly train MInDI-3D. We performed a comprehensive evaluation, including quantitative metrics, scalability analysis, generalisation tests, and a clinical assessment by 11 clinicians. Our results show MInDI-3D's effectiveness: with only 50 projections, it achieves a PSNR gain over uncorrected scans of 12.96 dB on the CT-RATE pseudo-CBCT test set and 6.10 dB on an independent real-world test set, enabling an 8x reduction in imaging radiation exposure. We demonstrate its scalability by showing that performance improves with more training data. Importantly, MInDI-3D matches the performance of a 3D U-Net on real-world scans from 16 cancer patients across distortion and task-based metrics. It also generalises to new CBCT scanner geometries. Clinicians rated our model as sufficient for patient positioning across all anatomical sites and found that it preserved lung tumour boundaries well.
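To illustrate the iterative refinement and the PSNR-gain metric mentioned above, the following is a minimal sketch and not the MInDI-3D implementation. It assumes the standard InDI update, x_{t-δ} = (δ/t) F(x_t, t) + (1 - δ/t) x_t, with a uniform step size δ = 1/num_steps. The names `indi_restore`, `psnr`, and `toy_restore_fn` are hypothetical, and `toy_restore_fn` is only a stand-in for a trained 3D restoration network; the actual schedule, noise perturbation, and architecture are not specified in this section.

```python
import numpy as np


def indi_restore(degraded, restore_fn, num_steps):
    """Apply the InDI update from t=1 (degraded input) down to t=0.

    restore_fn(x, t) is assumed to predict the clean volume from the
    current estimate x at time t (a trained network in practice).
    """
    x = np.asarray(degraded, dtype=np.float64).copy()
    for i in range(num_steps, 0, -1):
        t = i / num_steps   # current time in (0, 1]
        w = 1.0 / i         # equals delta / t for delta = 1 / num_steps
        x = w * restore_fn(x, t) + (1.0 - w) * x
    return x


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for a given intensity range."""
    mse = np.mean((np.asarray(reference) - np.asarray(estimate)) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)


# Toy usage on a synthetic 3D volume (all values are illustrative only).
rng = np.random.default_rng(0)
clean = rng.random((8, 8, 8))
degraded = clean + 0.1 * rng.standard_normal(clean.shape)


def toy_restore_fn(x, t):
    # Stand-in for a network: halves the residual to the clean volume.
    return 0.5 * (x + clean)


restored = indi_restore(degraded, toy_restore_fn, num_steps=4)
gain_db = psnr(clean, restored) - psnr(clean, degraded)
print(f"PSNR gain over the uncorrected volume: {gain_db:.2f} dB")
```

In this formulation, a single step returns the network's direct prediction, while more steps blend successive predictions with the current estimate; this is the mechanism through which the number of iterations can trade distortion against perceptual quality.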
