FCDM: A Physics-Guided Bidirectional Frequency Aware Convolution and Diffusion-Based Model for Sinogram Inpainting

Jiaze E; Srutarshi Banerjee; Tekin Bicer; Guannan Wang; Yanfu Zhang; Bin Ren

TL;DR

FCDM addresses the challenging problem of sparse-view sinogram inpainting in CT by integrating a physics-guided diffusion framework with a frequency-aware latent representation. It introduces Bidirectional Frequency-Domain Convolutions to disentangle spectral features along detector and angle axes, and enforces physical plausibility via physics-guided losses including Total Projection Consistency and Frequency Domain Consistency. The method further enhances denoising with Fourier-Enhanced Mask Embedding and Frequency-Adaptive Noise Scheduling, yielding robust, angularly coherent restorations. On real-world datasets, FCDM achieves SSIM > 0.93 and PSNR > 31 dB across various sparsity settings, outperforming diffusion-based and sinogram-specific baselines, with ablations confirming the contribution of each component. This approach offers a principled, geometry-aware pathway to high-fidelity sinogram restoration with potential to reduce radiation dose and scan time in CT applications.

Abstract

Computed tomography (CT) is widely used in scientific imaging systems such as synchrotron and laboratory-based nano-CT, but acquiring full-view sinograms requires high radiation dose and long scan times. Sparse-view CT alleviates this burden but yields incomplete sinograms with structured signal loss, hampering accurate reconstruction. Unlike RGB images, sinograms encode overlapping features along projection paths and exhibit distinct directional spectral patterns. Conventional RGB-oriented inpainting approaches, including diffusion models, are therefore ineffective for sinogram restoration, because they disregard the angular dependencies and physical constraints inherent to tomographic data. To overcome these limitations, we propose FCDM, a diffusion-based framework tailored for sinograms, which restores global structure through bidirectional frequency reasoning and angular-aware masking, while enforcing physical plausibility via physics-guided constraints and frequency-adaptive noise control. Experiments on real-world datasets show that FCDM consistently outperforms baselines, achieving SSIM over 0.93 and PSNR above 31 dB across diverse sparse-view scenarios.
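The two ideas the abstract leans on, sparse-view masking and a physical constraint on projections, can be illustrated with a toy example. The sketch below is not the paper's implementation: the definition of the Total Projection Consistency loss is not given in this section. It only demonstrates the textbook fact that, in ideal parallel-beam geometry, every view sums to the same total, and it drops whole views at random to mimic a sparse-view sinogram. The function names (`project`, `random_view_mask`, `apply_mask`) and the 3x3 phantom are hypothetical, chosen for illustration.

```python
import random

def project(image, axis):
    """Toy parallel-beam projection at 0 or 90 degrees: sum along one image axis."""
    if axis == 0:
        return [sum(col) for col in zip(*image)]  # one value per column
    return [sum(row) for row in image]            # one value per row

def random_view_mask(num_angles, ratio, seed=0):
    """Return 1 for kept views and 0 for dropped ones.

    Drops round(ratio * num_angles) views chosen uniformly at random
    (an assumed masking scheme, not necessarily the paper's).
    """
    rng = random.Random(seed)
    dropped = set(rng.sample(range(num_angles), round(ratio * num_angles)))
    return [0 if i in dropped else 1 for i in range(num_angles)]

def apply_mask(sinogram, mask):
    """Zero out every detector reading of each dropped view (row)."""
    return [[value * keep for value in row] for row, keep in zip(sinogram, mask)]

# Hypothetical 3x3 phantom; rows of `sino` are views, columns are detector bins.
phantom = [[0, 1, 0],
           [2, 3, 0],
           [0, 0, 4]]
sino = [project(phantom, 0), project(phantom, 1)]

# Each view integrates the same object, so the per-view totals agree.
totals = [sum(view) for view in sino]

# Dropping the second view leaves a row of zeros to be inpainted.
sparse = apply_mask(sino, [1, 0])
```

A restoration that fills the dropped view with values whose sum differs from the other views' total would violate this property, which is the kind of physical inconsistency a projection-consistency penalty can discourage.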

Paper Structure (30 sections, 14 equations, 3 figures, 8 tables)

Introduction
Related Work
FCDM Design and Analysis
Bidirectional Frequency-Domain Convolutions
Physics-Guided Loss Functions
Total Projection Consistency Loss
Frequency Domain Consistency Loss
Overall Loss
Fourier-Enhanced Mask Embedding
Frequency-Adaptive Noise Scheduling
Evaluation
Experimental Setup
Implementation Details
Experimental setup
Accuracy Comparisons with Baselines
...and 15 more sections

Figures (3)

Figure 1: Comparison of an RGB image and a sinogram, together with their spectra. Unlike RGB images, which have localized frequency components, sinograms exhibit structured spectral distributions due to the Radon transform.
Figure 2: Overview of FCDM. Stage 1 trains an encoder-decoder network equipped with BFDC to extract frequency- and geometry-aware latent representations. Stage 2 applies a diffusion-based model to perform latent-space inpainting guided by FEME and FANS. Here, $\mathcal{E}$ / $\mathcal{D}$ denote the encoder / decoder. $l_0$, $l_t$, $l_N$ denote latents in the diffusion process. $Q$, $K$, $V$ are the query, key, and value matrices from the attention mechanism. $\mathcal{F}$ denotes the Fourier transform. $n_t$ is the time-dependent spectral weighting map. $\epsilon_t$, $\epsilon_t^{freq}$, and $\epsilon_t^{final}$ are the standard, frequency-adapted, and final noises. $M(\theta)$ is the angle-dependent input mask, $\Phi(\theta)$ is the Fourier encoding operator applied to it, and $M'(\theta)$ is the resulting Fourier-enhanced mask embedding produced by FEME.
Figure 3: Visual comparison on TomoBank and LoDoPaB under random masking (ratio = 0.8). Rows 1 and 3 show the inpainted sinograms, while Rows 2 and 4 present the corresponding images reconstructed with FBP (Ramachandran and Lakshminarayanan, 1971).
