Solving 3D Inverse Problems using Pre-trained 2D Diffusion Models

Hyungjin Chung, Dohoon Ryu, Michael T. McCann, Marc L. Klasky, Jong Chul Ye

TL;DR

This work tackles 3D inverse problems in medical imaging by integrating a pretrained 2D diffusion prior with a model-based regularizer, enabling coherent 3D reconstructions from severely undersampled measurements. The proposed DiffusionMBIR performs diffusion denoising slice-by-slice along the z-axis while enforcing cross-slice consistency through a 3D ADMM data-consistency step and a z-direction TV prior, achieving memory efficiency suitable for commodity GPUs. Across sparse-view CT, limited-angle CT, and compressed sensing MRI, it delivers state-of-the-art results and demonstrates robust generalization to out-of-distribution data, even with minimal 3D training data. The approach offers a practical, scalable route to high-fidelity 3D reconstructions by leveraging 2D diffusion priors and MBIR-driven optimization, with strong implications for clinical imaging workflows.

Abstract

Diffusion models have emerged as the new state-of-the-art generative models, producing high-quality samples and offering intriguing properties such as mode coverage and high flexibility. They have also been shown to be effective inverse problem solvers, acting as the prior of the distribution, while the information of the forward model can be incorporated at the sampling stage. Nonetheless, as the generative process remains in the same high-dimensional space (i.e. identical to the data dimension), the models have not been extended to 3D inverse problems due to the extremely high memory and computational cost. In this paper, we combine ideas from conventional model-based iterative reconstruction with modern diffusion models, which leads to a highly effective method for solving 3D medical image reconstruction tasks such as sparse-view tomography, limited-angle tomography, and compressed sensing MRI using pre-trained 2D diffusion models. In essence, we propose to augment the 2D diffusion prior with a model-based prior in the remaining direction at test time, such that one can achieve coherent reconstructions across all dimensions. Our method can be run on a single commodity GPU and establishes the new state-of-the-art, showing that the proposed method can perform reconstructions of high fidelity and accuracy even in the most extreme cases (e.g. 2-view 3D tomography). We further reveal that the generalization capacity of the proposed method is surprisingly high, and that it can be used to reconstruct volumes that are entirely different from the training dataset.
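
To make the "model-based prior in the remaining direction" concrete, the following is a minimal NumPy sketch of an ADMM solver for a least-squares data term plus a total-variation penalty along the z-axis only. It is an illustration under stated assumptions, not the authors' implementation: the forward operator is a hypothetical element-wise `mask` standing in for the CT or MRI operator, the names `dz`, `dz_t`, `admm_tv_z`, `lam`, and `rho` are chosen here, and the slice-wise diffusion denoising step that the method interleaves with this update is omitted.

```python
import numpy as np

def dz(x):
    """Forward finite difference along z (axis 0); the last slice gets zero."""
    d = np.zeros_like(x)
    d[:-1] = x[1:] - x[:-1]
    return d

def dz_t(p):
    """Adjoint (transpose) of dz."""
    out = np.zeros_like(p)
    out[:-1] -= p[:-1]
    out[1:] += p[:-1]
    return out

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def conjugate_gradient(apply_op, b, x0, n_iter=20):
    """Plain CG for a symmetric positive-definite operator."""
    x = x0.copy()
    r = b - apply_op(x)
    p = r.copy()
    rs = np.vdot(r, r)
    for _ in range(n_iter):
        if rs < 1e-20:  # already converged
            break
        Ap = apply_op(p)
        alpha = rs / np.vdot(p, Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = np.vdot(r, r)
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def admm_tv_z(y, mask, x0, lam=0.3, rho=1.0, n_admm=50):
    """Approximately minimize 0.5*||mask*x - y||^2 + lam*||dz(x)||_1.

    ASSUMPTION: `mask` is a binary element-wise stand-in for the real
    forward operator; it is not the CT/MRI operator used in the paper.
    """
    x = x0.copy()
    z = dz(x)                # split variable for the z-differences
    w = np.zeros_like(x)     # scaled dual variable
    normal_op = lambda v: mask * v + rho * dz_t(dz(v))
    for _ in range(n_admm):
        rhs = mask * y + rho * dz_t(z - w)
        x = conjugate_gradient(normal_op, rhs, x)      # data-consistency update
        z = soft_threshold(dz(x) + w, lam / rho)       # TV proximal update
        w = w + dz(x) - z                              # dual update
    return x

# Usage: a noisy volume that is piecewise constant along z.
rng = np.random.default_rng(0)
x_true = np.zeros((16, 4, 4))
x_true[8:] = 1.0
y = x_true + 0.1 * rng.standard_normal(x_true.shape)
x_hat = admm_tv_z(y, np.ones_like(y), x0=y)
```

Because the penalty acts only on differences between neighbouring slices, in-plane structure is left to the 2D prior; in this sketch nothing regularizes within a slice at all.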

Paper Structure (17 sections, 21 equations, 10 figures, 4 tables, 2 algorithms)

This paper contains 17 sections, 21 equations, 10 figures, 4 tables, 2 algorithms.

Figures (10)

  • Figure 1: 3D reconstruction results with DiffusionMBIR. First row: measurement, second row: our method, third row: ground truth. Yellow inset: measurement process. Sparse-view tomography: 8-view measurement, Limited-angle tomography: [0 90]$^\circ$ out of [0 180]$^\circ$ angle measurement, Compressed-sensing MRI: 1D uniform sub-sampling of $\times 2$ acceleration. (In-distribution): test data aligned with training data, (Out-of-distribution): test data vastly different from training data.
  • Figure 2: Visualization of the measurement process for the three tasks we tackle in this work: (a) limited-angle CT (LA-CT), the measurement model of Fig. 4; (b) sparse-view CT (SV-CT), the measurement model of Fig. 3 and of the 4-view and 2-view SV-CT result figures; (c) compressed sensing MRI (CS-MRI), the measurement model of the CS-MRI result figure.
  • Figure 3: 8-view SV-CT reconstruction results of the test data (First row: axial slice, second row: sagittal slice, third row: coronal slice). (a) FBP, (b) ADMM-TV, (c) Lahiri et al. [lahiri2022sparse], (d) Chung et al. [chung2022improving], (e) proposed method, (f) ground truth. PSNR/SSIM values presented in the upper right corner. Green lines in the inset of first row (a): measured angles.
  • Figure 4: 90$^\circ$ LA-CT reconstruction results of the test data (First row: axial slice, second row: sagittal slice, third row: coronal slice). (a) FBP, (b) Zhang et al. [zhang2016image], (c) Lahiri et al. [lahiri2022sparse], (d) Chung et al. [chung2022improving], (e) proposed method, (f) ground truth. PSNR/SSIM values presented in the upper right corner. Green area in the inset of first row (a): measured, Yellow area in the inset of first row (a): not measured.
  • Figure 5: 8-view SV-CT reconstruction results of the OOD data (Same geometry as in Fig. 3). (a) Ellipsis laid on top of the test data volume, (b) Phantom that consists of spheres located randomly.
  • ...and 5 more figures
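
The three measurement settings named in the captions above can be sketched as sampling patterns. This is a hedged illustration only: the helper names are hypothetical, the angles are assumed to be evenly spaced over [0, 180) degrees, and the MRI mask simply keeps every second column along one axis. The paper's exact sampling details (for example, which axis is sub-sampled or whether central k-space lines are retained) are not stated in this section.

```python
import numpy as np

def sparse_view_angles(n_views, full_range=180.0):
    """Evenly spaced projection angles in degrees (assumed layout)."""
    return np.linspace(0.0, full_range, n_views, endpoint=False)

def limited_angle_angles(n_full, max_angle=90.0, full_range=180.0):
    """Keep only the angles below max_angle from a full evenly spaced scan."""
    full = np.linspace(0.0, full_range, n_full, endpoint=False)
    return full[full < max_angle]

def uniform_1d_mask(shape, acceleration=2):
    """Binary k-space mask keeping every `acceleration`-th column."""
    mask = np.zeros(shape)
    mask[:, ::acceleration] = 1.0
    return mask

sv = sparse_view_angles(8)        # 8-view SV-CT
la = limited_angle_angles(180)    # [0, 90) out of [0, 180) degrees
cs = uniform_1d_mask((4, 8), 2)   # x2 acceleration, 1D uniform
```

In each case the measurement is a subset of what a full scan would acquire, which is what makes the reconstruction problem ill-posed and motivates a strong prior.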