Physics-informed DeepCT: Sinogram Wavelet Decomposition Meets Masked Diffusion

Zekun Zhou, Tan Liu, Bing Yu, Yanru Gong, Liu Shi, Qiegen Liu

Abstract

Diffusion models show remarkable potential for sparse-view computed tomography (SVCT) reconstruction. However, when a network is trained on a limited sample space, its generalization capability may be constrained, which degrades performance on unfamiliar data. For image generation tasks, this can lead to issues such as blurry details and inconsistencies between regions. To alleviate this problem, we propose a Sinogram-based Wavelet random decomposition And Random mask diffusion Model (SWARM) for SVCT reconstruction. Specifically, introducing a random mask strategy in the sinogram effectively expands the limited training sample space. This enables the model to learn a broader range of data distributions, enhancing its ability to model data uncertainty and to generalize. In addition, applying a random training strategy to the high-frequency components of the sinogram's wavelet decomposition enhances feature representation and improves the ability to capture details in different frequency bands, thereby improving performance and robustness. A two-stage iterative reconstruction method is adopted to ensure the global consistency of the reconstructed image while refining its details. Experimental results demonstrate that SWARM outperforms competing approaches in both quantitative and qualitative performance across various datasets.
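
The two data-side ideas named in the abstract, random masking of the sinogram and random selection among its high-frequency wavelet components, can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration rather than the authors' implementation: the mask drops whole projection views (rows), the wavelet is a one-level orthonormal Haar transform, exactly one high-frequency sub-band is drawn per sample, and the names `random_view_mask`, `haar_dwt2`, `make_training_sample`, and `keep_ratio` are hypothetical.

```python
import numpy as np


def random_view_mask(n_views, n_detectors, keep_ratio, rng):
    """Binary mask keeping a random subset of views (rows). Assumed mask form."""
    n_keep = max(1, int(round(keep_ratio * n_views)))
    kept = rng.choice(n_views, size=n_keep, replace=False)
    mask = np.zeros((n_views, n_detectors), dtype=np.float64)
    mask[kept, :] = 1.0
    return mask


def haar_dwt2(x):
    """One-level 2D orthonormal Haar transform; x needs even height and width."""
    s = np.sqrt(2.0)
    a = (x[0::2, :] + x[1::2, :]) / s   # low-pass along the view axis
    d = (x[0::2, :] - x[1::2, :]) / s   # high-pass along the view axis
    ll = (a[:, 0::2] + a[:, 1::2]) / s  # low-frequency (LF) approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / s  # high-frequency (HF) sub-bands
    hl = (d[:, 0::2] + d[:, 1::2]) / s
    hh = (d[:, 0::2] - d[:, 1::2]) / s
    return ll, (lh, hl, hh)


def make_training_sample(sinogram, rng, keep_ratio=0.5):
    """Build one augmented sample: a masked sinogram plus one random HF band."""
    mask = random_view_mask(sinogram.shape[0], sinogram.shape[1], keep_ratio, rng)
    masked = sinogram * mask                # a fresh mask on every call
    ll, highs = haar_dwt2(sinogram)
    band = int(rng.integers(len(highs)))    # hypothetical: pick one HF band
    return masked, mask, ll, highs[band], band
```

Because a new mask and a new sub-band index are drawn on every call, the same sinogram yields many distinct training samples, which is the sense in which masking "expands" a limited sample space. The diffusion model itself and the two-stage iterative reconstruction are outside the scope of this sketch.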

Paper Structure

This paper contains 32 sections, 32 equations, 10 figures, 6 tables, 1 algorithm.

Figures (10)

  • Figure 1: Different training strategies and the influence of the sinogram in deep learning. (a) Distribution of training data in a closed data space; (b) distribution of the data obtained through the mask extension method in the extended data space; (c) generation process of the random mask and how it is embedded in the data.
  • Figure 2: The pipeline of the SWARM training process and iterative reconstruction procedure. Training stage: (a) model training based on random masks in the sinogram; (b) model training for random decomposition of the high-frequency wavelet components of the sinogram. Iterative reconstruction stage: (c) the proposed SWARM method is used to reconstruct the sparse-view CT projection domain. "LF": low-frequency. "HF": high-frequency.
  • Figure 3: Reconstructed images from 90 views using different methods on AAPM challenge data. (a) The reference image versus the images reconstructed by (b) FBP, (c) FBPConvNet, (d) HDNet, (e) GMSD, (f) SWORD, and (g) SWARM. The display window is [-480, 945] HU. The second row shows the residual between the reference image and the reconstructed image.
  • Figure 4: Reconstructed images from 60 views using different methods on CIRS phantom data. (a) The reference image versus the images reconstructed by (b) FBP, (c) FBPConvNet, (d) HDNet, (e) GMSD, (f) SWORD, and (g) SWARM. The display window is [675, 1300] HU. The second row shows the residual between the reference image and the reconstructed image.
  • Figure 5: Reconstructed images from 60 views using different methods on Dental Arch data. (a) The reference image versus the images reconstructed by (b) FBP, (c) FBPConvNet, (d) HDNet, (e) GMSD, (f) SWORD, and (g) SWARM. The display window is [-60, 1300] HU. The second row shows the residual between the reference image and the reconstructed image.
  • ...and 5 more figures