Iterative CT Reconstruction via Latent Variable Optimization of Shallow Diffusion Models

Sho Ozaki, Shizuo Kaji, Toshikazu Imae, Kanabu Nawa, Hideomi Yamashita, Keiichi Nakagawa

TL;DR

This work addresses CT reconstruction from sparse projections by optimizing the latent variable of a shallow diffusion model (SDDPM) within iterative reconstruction (IR). Stopping the diffusion process at a small $T$ and fixing the set of noises $\{u_i\}$ makes the reverse process deterministic, so the reconstruction can optimize the latent variable $x_T$ through the data-fidelity loss $L = \| y - A x_0 \|^2$, where $x_0 = f_{\theta,T,\{u_i\}}(x_T)$ is the output of the reverse process. A single trained DDPM steers the reconstruction toward high-quality images while preserving anatomical structures, with no explicit trade-off parameter between image quality and structure preservation. Experiments on 1/10 and 1/20 sparse projections show that the proposed IR+SDDPM (with $T=1$) outperforms IR, IR+TV, and DDPM alone in SSIM and PSNR, and that more iterations can compensate for higher sparsity. In principle, the framework extends to other imaging modalities, including MRI, and could be sped up with acceleration techniques.
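The latent-variable optimization summarized above can be illustrated with a small NumPy sketch. It is a toy under stated assumptions, not the authors' implementation: the trained noise predictor is replaced by a hypothetical linear map `W`, the projection operator by a random matrix `A`, the initial IR image by a pseudo-inverse solution, and the linear beta schedule and the choice $\sigma_t = \sqrt{\beta_t}$ are standard DDPM defaults rather than values taken from the paper. Because the toy predictor is linear, the gradient with respect to $x_T$ can be written by hand; a real network would need automatic differentiation.

```python
import numpy as np


def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Linear beta schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    return betas, alphas, np.cumprod(alphas)


def forward_diffuse(x0, T, sched, rng):
    """Closed-form DDPM forward process: jump from x_0 straight to x_T."""
    alpha_bar = sched[2][T - 1]
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise


def step_matrix(t, sched, W):
    """Linear part of one reverse step when the noise predictor is eps = W @ x."""
    betas, alphas, alpha_bars = sched
    scale = betas[t - 1] / np.sqrt(1.0 - alpha_bars[t - 1])
    return (np.eye(W.shape[0]) - scale * W) / np.sqrt(alphas[t - 1])


def reverse_process(x_T, T, sched, W, fixed_noises):
    """Deterministic map x_0 = f(x_T): the noises u_i are fixed, not resampled."""
    betas = sched[0]
    x = x_T
    for t in range(T, 0, -1):
        x = step_matrix(t, sched, W) @ x
        if t > 1:  # standard DDPM sampling adds no noise at the last step
            x = x + np.sqrt(betas[t - 1]) * fixed_noises[t - 1]
    return x


def optimize_latent(y, A, x_T, T, sched, W, fixed_noises, num_iters=200):
    """Gradient descent on L = ||y - A x_0||^2 with respect to x_T only."""
    # Jacobian d x_0 / d x_T; constant here because the toy predictor is linear.
    J = np.eye(x_T.size)
    for t in range(T, 0, -1):
        J = step_matrix(t, sched, W) @ J
    AJ = A @ J
    lr = 0.5 / np.linalg.norm(AJ, 2) ** 2  # 1 / Lipschitz constant of the gradient
    losses = []
    for _ in range(num_iters):
        residual = A @ reverse_process(x_T, T, sched, W, fixed_noises) - y
        losses.append(float(residual @ residual))
        x_T = x_T - lr * 2.0 * (AJ.T @ residual)
    return x_T, losses


rng = np.random.default_rng(0)
n, m, T = 16, 8, 5                          # toy sizes; m < n mimics sparse data
A = rng.standard_normal((m, n))             # stand-in for the projection operator
W = 0.1 * rng.standard_normal((n, n))       # stand-in for the trained network
sched = make_schedule()
fixed_noises = rng.standard_normal((T, n))  # the fixed set {u_i}
y = A @ rng.standard_normal(n)              # synthetic measurements
x_init = np.linalg.pinv(A) @ y              # stand-in for the initial IR image
x_T0 = forward_diffuse(x_init, T, sched, rng)
x_T, losses = optimize_latent(y, A, x_T0, T, sched, W, fixed_noises)
print(losses[0], losses[-1])                # the loss should shrink over iterations
```

Only `x_T` is updated in the loop; `W` (the model) and `fixed_noises` stay frozen, which is what keeps the map from latent to image deterministic across iterations.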

Abstract

Image-generative artificial intelligence (AI) has garnered significant attention in recent years. In particular, the diffusion model, a core component of generative AI, produces high-quality images with rich diversity. In this study, we propose a novel computed tomography (CT) reconstruction method that combines the denoising diffusion probabilistic model with iterative CT reconstruction. In sharp contrast to previous studies, we optimize the fidelity loss of CT reconstruction with respect to the latent variable of the diffusion model, rather than the image or the model parameters. To suppress the changes in anatomical structures produced by the diffusion model, we make the diffusion and reverse processes shallow and fix the set of noises added in the reverse process, so that it is deterministic during inference. We demonstrate the effectiveness of the proposed method through sparse-projection CT reconstruction from 1/10 projection data. Despite the simplicity of its implementation, the proposed method has the potential to reconstruct high-quality images while preserving the patient's anatomical structures. It was found to outperform existing methods, including iterative reconstruction, iterative reconstruction with total variation, and the diffusion model alone, in terms of quantitative indices such as the structural similarity index and peak signal-to-noise ratio. We also explored sparser CT reconstruction from 1/20 projection data with the same trained diffusion model. As the number of iterations increased, the image quality improved to a level comparable to that of 1/10 sparse-projection CT reconstruction. In principle, this method can be widely applied not only to CT but also to other imaging modalities.
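To make the 1/10 and 1/20 settings and the PSNR index concrete, here is a minimal sketch. It assumes that sparse projection means keeping every 10th or 20th view at uniform angular spacing, which the text above does not specify, and the sinogram shape and the `data_range` value are made up for illustration.

```python
import numpy as np


def subsample_projections(sinogram, factor):
    """Keep every `factor`-th view along axis 0 (uniform subsampling, assumed)."""
    return sinogram[::factor]


def psnr(reference, image, data_range):
    """Peak signal-to-noise ratio in dB; `data_range` is the assumed peak value."""
    diff = np.asarray(reference, dtype=float) - np.asarray(image, dtype=float)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(data_range ** 2 / mse))


full = np.zeros((360, 64))                       # hypothetical: 360 views x 64 detector bins
print(subsample_projections(full, 10).shape)     # (36, 64)
print(subsample_projections(full, 20).shape)     # (18, 64)

reference = np.zeros((4, 4))
noisy = reference + 0.1                          # constant error of 0.1, so MSE is about 0.01
print(psnr(reference, noisy, data_range=1.0))    # approximately 20 dB
```

Halving the number of views from 1/10 to 1/20 halves the rows of the measurement, which is why the same trained model needs more iterations to reach comparable quality.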

Paper Structure (16 sections, 14 equations, 10 figures, 2 algorithms)

Figures (10)

  • Figure 1: Schema of the proposed method. a) Training of the diffusion model. In the training phase, the diffusion process is stopped at a small $T$ so that the latent variable $x_{T}$ still partially contains information on the anatomical structures of the patient. b) Reconstruction using the trained model. At the beginning of the reconstruction phase, an initial image is reconstructed from the observed projection data using IR. A forward (diffusion) process is then applied to the initial image to obtain the latent variable. In the reverse process, the trained diffusion model subtracts the noise from the latent variable, and the final output of the model is a high-quality image. The reconstruction loss is computed from this output and the observed projection data, and it is optimized with respect to the latent variable $x_{T}$. c) Mapping from a latent variable to an image using the trained diffusion model. The diffusion model is trained to map latent variables at a small $T$ to images within the high-quality region. At the same time, the reconstruction loss ensures that the output of the diffusion model preserves the anatomical structure. At the end of the optimization, a high-quality image is reconstructed while maintaining the anatomical structures of the patient. Note that the latent space has the same dimensions as the image space, for example $256 \times 256$ or $512 \times 512$, although both spaces are drawn in two dimensions (2D) in the figure for simplicity.
  • Figure 2: Reconstruction loss as a function of iterations. The losses of the proposed method with fixed and random noises are shown. In this analysis, $T=100$ is used for the SDDPM.
  • Figure 3: Visual comparison of images reconstructed using the proposed method with fixed and random noises. The upper row shows the images reconstructed using the proposed method with fixed and random noises and the ground truth image. The ground truth image is reconstructed by IR using full scan projection, whereas other images are reconstructed by the proposed method using a 1/10 sparse projection. A display window of [-150, 200] HU is used for each image. The values of SSIM and PSNR are indicated. The lower row shows the absolute value of the difference between each image and the ground truth image. A display window of [0, 100] HU is used for the heat maps.
  • Figure 4: Reconstruction loss as a function of iterations. Losses from the proposed method with $T=1, 5, 10, 50, 100, 500$, and $1000$ are shown.
  • Figure 5: Visual comparison of the images reconstructed using the proposed method with $T=1, 5, 10, 50, 100, 500$, and $1000$. The first and third rows show the ground truth image reconstructed by IR using full scan projection and the images reconstructed by the proposed method using a 1/10 sparse projection. A display window of [-150, 200] HU is used for each image. The values of SSIM and PSNR are indicated. The second and fourth rows show the absolute values of the differences between each image and the ground truth image. A display window of [0, 100] HU is used for the heat maps.
  • ...and 5 more figures
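
The display windows quoted in the captions above ([-150, 200] HU for images, [0, 100] HU for difference heat maps) can be sketched as a clip-and-rescale step. This is a minimal illustration with made-up HU values; the function names are hypothetical, and how the authors rendered the figures is not stated.

```python
import numpy as np


def apply_window(image_hu, low, high):
    """Clip HU values to [low, high] and rescale to [0, 1] for display."""
    clipped = np.clip(np.asarray(image_hu, dtype=float), low, high)
    return (clipped - low) / (high - low)


def abs_difference_map(image_hu, reference_hu):
    """Per-pixel absolute difference in HU, as in the heat-map rows."""
    a = np.asarray(image_hu, dtype=float)
    b = np.asarray(reference_hu, dtype=float)
    return np.abs(a - b)


reference = np.array([[-1000.0, -150.0], [25.0, 400.0]])      # made-up HU values
recon = reference + np.array([[0.0, 30.0], [-60.0, 250.0]])   # made-up errors

shown = apply_window(reference, -150, 200)                          # image window
heat = apply_window(abs_difference_map(recon, reference), 0, 100)   # heat-map window
print(shown)
print(heat)
```

Values outside a window saturate at 0 or 1, so an error above 100 HU looks the same as an error of exactly 100 HU in the heat maps.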