Diffusion Low Rank Hybrid Reconstruction for Sparse View Medical Imaging

Zongyin Deng, Qing Zhou, Yuhao Fang, Zijian Wang, Yao Lu, Ye Zhang, Chun Li

TL;DR

This paper addresses high-fidelity 3D CT reconstruction from extremely sparse-view, low-dose data. The authors propose TV-LoRA, a two-stage framework that couples a diffusion-based generative prior with anisotropic total variation (TV) and low-rank (LoRA) regularization, solved via an ADMM scheme with embedded diffusion steps. Key contributions include the integration of generative priors with structured regularization, an adaptive optimization strategy, and an FFT-accelerated GPU implementation. The method is validated on LDCT, CTHD, and LIDC, where it reports higher PSNR/SSIM than the baselines, especially at $N_{\mathrm{view}}\in\{2,4,8\}$. The authors also report a theoretical $O(1/k)$ convergence rate, practical speedups, and robust performance, which they take as evidence of potential clinical applicability in sparse-sampling CT.

Abstract

This work presents TV-LoRA, a novel method for low-dose sparse-view CT reconstruction that combines a diffusion generative prior (NCSN++ with SDE modeling) with multiple regularization constraints, including anisotropic TV and the nuclear norm (LoRA), within an ADMM framework. To address ill-posedness and texture loss under extremely sparse views, TV-LoRA integrates generative and physical constraints and uses a 2D slice-based strategy with FFT acceleration and tensor-parallel optimization for efficient inference. Experiments on the AAPM-2016, CTHD, and LIDC datasets with $N_{\mathrm{view}}\in\{8,4,2\}$ show that TV-LoRA consistently surpasses baseline methods in SSIM, texture recovery, edge clarity, and artifact suppression, demonstrating strong robustness and generalizability. Ablation studies confirm the complementary effects of LoRA regularization and diffusion priors, while the FFT-PCG module provides a speedup. Overall, Diffusion + TV-LoRA achieves high-fidelity, efficient 3D CT reconstruction and shows broad clinical applicability in low-dose, sparse-sampling scenarios.
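To make the regularization terms named in the abstract more concrete, the sketch below shows the standard building blocks that an ADMM solver with anisotropic TV and nuclear-norm penalties typically relies on: element-wise soft-thresholding, singular value thresholding, and an FFT-based solve of a system involving periodic finite differences. It is not the authors' implementation. The function names, the parameters `tau`, `mu`, and `rho`, and the periodic boundary condition are all assumptions made for illustration, and the paper's FFT-PCG module may be organized quite differently.

```python
import numpy as np


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (used for the anisotropic TV split)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def singular_value_threshold(matrix, tau):
    """Proximal operator of tau * ||matrix||_* (nuclear norm)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Shrink the singular values, then rebuild the matrix.
    return (u * soft_threshold(s, tau)) @ vt


def solve_periodic_tv_system(b, mu, rho):
    """Solve (mu*I + rho*(Dx^T Dx + Dy^T Dy)) x = b for a 2D image.

    Assumes forward differences with periodic boundaries (an illustrative
    choice), so the operator is diagonal in the 2D Fourier basis.
    """
    h, w = b.shape
    # Eigenvalues of D^T D for a periodic forward difference: 2 - 2*cos(2*pi*k/n).
    eig_y = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(h) / h)
    eig_x = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(w) / w)
    denom = mu + rho * (eig_y[:, None] + eig_x[None, :])
    return np.real(np.fft.ifft2(np.fft.fft2(b) / denom))
```

In a full sparse-view CT solver the data-fidelity term involves the projection operator, which is not diagonalized by the FFT. A solve like `solve_periodic_tv_system` would therefore more plausibly act as a preconditioner inside a conjugate-gradient loop than as the complete image update.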

Paper Structure

This paper contains 17 sections, 4 equations, 2 figures, 5 tables, 1 algorithm.

Figures (2)

  • Figure 1: Quantitative comparison of sparse-view CT (SV-CT) reconstruction results on the LDCT dataset with $N_{\mathrm{view}} = 8, 4, 2$. (a) 8 views, (b) 4 views, (c) 2 views. The results demonstrate that the proposed method consistently preserves structural integrity and fine details across different sparsity levels, outperforming baseline approaches.
  • Figure 2: Comparison of TV-LoRA and seven representative baseline models on the (a) LDCT, (b) CTHD, and (c) LIDC datasets in terms of PSNR and SSIM for three orthogonal slices under varying numbers of projection views ($N_{\mathrm{view}}=2, 4, 8$).
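As a rough illustration of how a per-slice PSNR such as the one reported in Figure 2 might be computed, the sketch below extracts the three central orthogonal slices of a volume and evaluates PSNR on a pair of images. The function names, the choice of central slices, and the `data_range` default are assumptions for illustration; the paper's exact evaluation protocol is not given here. SSIM is omitted, but scikit-image provides `skimage.metrics.structural_similarity` for that purpose.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to `data_range`."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)


def central_orthogonal_slices(volume):
    """Return the central slice along each of the three axes of a 3D volume."""
    d0, d1, d2 = volume.shape
    return (
        volume[d0 // 2, :, :],
        volume[:, d1 // 2, :],
        volume[:, :, d2 // 2],
    )
```

A per-view comparison would then apply `psnr` to matching slices of the ground-truth and reconstructed volumes, for example `[psnr(g, r) for g, r in zip(central_orthogonal_slices(gt), central_orthogonal_slices(rec))]`, where `gt` and `rec` are hypothetical arrays of equal shape.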