DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction

Jiayang Shi, Daniel M. Pelt, K. Joost Batenburg

TL;DR

This work benchmarks ten recent diffusion-based methods alongside seven strong baselines, including model-based, unsupervised, and supervised approaches, and provides detailed insights into the behavior, strengths, and limitations of diffusion models for CT reconstruction.

Abstract

Diffusion models have recently emerged as powerful priors for solving inverse problems. While computed tomography (CT) is theoretically a linear inverse problem, it poses many practical challenges. These include correlated noise, artifact structures, reliance on system geometry, and misaligned value ranges, which make the direct application of diffusion models more difficult than in domains like natural image generation. To systematically evaluate how diffusion models perform in this context and compare them with established reconstruction methods, we introduce DM4CT, a comprehensive benchmark for CT reconstruction. DM4CT includes datasets from both medical and industrial domains with sparse-view and noisy configurations. To explore the challenges of deploying diffusion models in practice, we additionally acquire a high-resolution CT dataset at a high-energy synchrotron facility and evaluate all methods under real experimental conditions. We benchmark ten recent diffusion-based methods alongside seven strong baselines, including model-based, unsupervised, and supervised approaches. Our analysis provides detailed insights into the behavior, strengths, and limitations of diffusion models for CT reconstruction. The real-world dataset is publicly available at zenodo.org/records/15420527, and the codebase is open-sourced at github.com/DM4CT/DM4CT.

Paper Structure (29 sections, 23 equations, 19 figures, 15 tables, 2 algorithms)

Figures (19)

  • Figure 1: Overview of the DM4CT benchmark. (a) The reconstruction pipeline, where representative diffusion and baseline methods are applied to measured sinograms using the same forward model. (b) The datasets used in the benchmark, including two simulated CT datasets (medical and industrial) and one real-world dataset acquired at a synchrotron facility. (c) The five simulation configurations used to evaluate robustness to limited views, noise, and ring artifacts. Two example FBP reconstructions under noise and ring artifact conditions are shown. (d) The evaluation metrics, including both qualitative (visual) and quantitative (image quality and computational efficiency) criteria.
  • Figure 2: Reconstruction results of diffusion-based and other established methods. Top: medical dataset (config iv, 80 angles with noise & ring artifacts); middle: industrial dataset (config ii, 20 angles with mild noise); bottom: real-world synchrotron dataset (60 angles). Red and green boxes show zoomed-in regions. PSNR and SSIM appear in the top-left and top-right of each image. A dash (–) indicates that the method exceeded the 40 GB GPU memory limit for single-slice reconstruction and was therefore not run. All images are linearly rescaled with the same mapping across methods to improve contrast.
  • Figure 3: (a) Impact of data consistency step size $\eta$ (Equation \ref{eq:data_consistency_update}) on PSNR and data fit in DPS. Moderate values improve both, while large $\eta$ disrupts denoising and causes collapse. Visual examples in the plot highlight the transition from prior-dominated to noise-dominated reconstructions. (b) Mean and standard deviation of ten MCG reconstructions conditioned on the same real measurement. Note that the real measurement used in (b) is different from the one used for (a).
  • Figure 4: Decomposition of reconstructions into range and null space components for different data consistency strategies under config i. For each method, the full reconstruction is shown on the left, with zoomed-in red insets of the range component in the center and the corresponding null component on the right. The number at the top-left of each null component gives its relative L2 energy as a percentage of the total reconstruction, reflecting how much content the prior introduced. Zoom in for details.
  • Figure 5: Reconstruction results of latent diffusion methods using only data consistency gradients (PSLD) versus additional optimization steps (ReSample) under noise-free (40 projections) and noisy (80 projections) scenarios. ADMM-PDTV serves as a classical model-based baseline that applies data consistency optimization with a heuristic prior. Red insets show magnified regions.
  • ...and 14 more figures
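
The range and null space decomposition described in the Figure 4 caption can be illustrated with a small sketch. This is not the benchmark's code: it assumes a toy dense matrix `A` in place of the CT forward operator, uses the pseudoinverse to form the range component `A^+ A x`, and assumes that "relative L2 energy" means the ratio of squared L2 norms. The function names `range_null_split` and `null_energy_percent` are hypothetical.

```python
import numpy as np

def range_null_split(A, x):
    """Split x into a range component A^+ A x and a null component x - A^+ A x."""
    A_pinv = np.linalg.pinv(A)       # Moore-Penrose pseudoinverse of the toy operator
    x_range = A_pinv @ (A @ x)       # part of x that the measurements determine
    x_null = x - x_range             # part invisible to A, so it must come from the prior
    return x_range, x_null

def null_energy_percent(x, x_null):
    """Null-component energy as a percentage of the total (assumed: squared L2 norms)."""
    return 100.0 * np.sum(x_null ** 2) / np.sum(x ** 2)

# Toy underdetermined problem: 20 measurements, 64 unknowns (a stand-in for sparse-view CT).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 64))
x = rng.standard_normal(64)          # stand-in for a reconstructed image, flattened

x_range, x_null = range_null_split(A, x)
pct = null_energy_percent(x, x_null)
print(f"null-space energy: {pct:.1f}% of the reconstruction")
```

In this toy setting the null component satisfies `A @ x_null ≈ 0`, so it leaves the data fit unchanged. For a real CT operator a dense pseudoinverse is usually impractical, and an iterative least-squares solver would be the more likely choice.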