CT-Conditioned Diffusion Prior with Physics-Constrained Sampling for PET Super-Resolution

Liutao Yang, Zi Wang, Peiyuan Jing, Xiaowen Wang, Javier A. Montoya-Zegarra, Kuangyu Shi, Daoqiang Zhang, Guang Yang

Abstract

PET super-resolution is highly under-constrained because paired multi-resolution scans from the same subject are rarely available, and effective resolution is determined by scanner-specific physics (e.g., PSF, detector geometry, and acquisition settings). This limits supervised end-to-end training and makes purely image-domain generative restoration prone to hallucinated structures when anatomical and physical constraints are weak. We formulate PET super-resolution as posterior inference under heterogeneous system configurations and propose a CT-conditioned diffusion framework with physics-constrained sampling. During training, a conditional diffusion prior is learned from high-quality PET/CT pairs using cross-attention for anatomical guidance, without requiring paired LR--HR PET data. During inference, measurement consistency is enforced through a scanner-aware forward model with explicit PSF effects and gradient-based data-consistency refinement. Under both standard and OOD settings, the proposed method consistently improves experimental metrics and lesion-level clinical relevance indicators over strong baselines, while reducing hallucination artifacts and improving structural fidelity.
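
The gradient-based data-consistency refinement mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a simplified 2D forward model (Gaussian PSF blur applied by circular FFT convolution, followed by average-pooling downsampling) and a single gradient step on the squared measurement residual. The function names, the FWHM value, and the step size are all hypothetical.

```python
import numpy as np

def gaussian_otf(shape, fwhm_px):
    """Transfer function of a normalized Gaussian PSF (assumed PSF model)."""
    sigma = fwhm_px / 2.3548  # FWHM -> standard deviation
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))

def forward(x, otf, s):
    """Hypothetical forward model A: PSF blur, then s-by-s average pooling."""
    blurred = np.real(np.fft.ifft2(np.fft.fft2(x) * otf))
    h, w = blurred.shape
    return blurred.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def adjoint(r, otf, s):
    """Adjoint A^T: spread each low-res value over its block, then blur."""
    up = np.repeat(np.repeat(r, s, axis=0), s, axis=1) / (s * s)
    return np.real(np.fft.ifft2(np.fft.fft2(up) * np.conj(otf)))

def data_consistency_step(x, y, otf, s, step=1.0):
    """One gradient step on 0.5 * ||A(x) - y||^2 (illustrative only)."""
    residual = forward(x, otf, s) - y
    return x - step * adjoint(residual, otf, s)

# Usage on synthetic data: the residual shrinks after one step.
rng = np.random.default_rng(0)
x_true = rng.random((32, 32))
otf = gaussian_otf((32, 32), fwhm_px=4.0)
y = forward(x_true, otf, s=4)
x0 = np.zeros((32, 32))
x1 = data_consistency_step(x0, y, otf, s=4)
```

In a diffusion sampler, a step of this kind would typically be interleaved with the denoising updates. The schedule and weighting used in the paper are not specified in this section.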

Paper Structure (17 sections, 9 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Conceptual motivation and method positioning for PET super-resolution. (a) PET super-resolution is ill-posed due to scanner-dependent degradation and limited paired data. (b) Supervised CNN mapping depends on paired LR--HR data and tends to over-smooth details. (c) Image-domain diffusion can recover texture but may introduce hallucinations without structural/physics constraints. (d) Our method combines CT-conditioned priors with scanner-aware, PSF-aware consistency for faithful and physically consistent reconstruction.
  • Figure 2: Overview of the proposed CT-guided diffusion framework. Pretraining (a): learn a CT-conditioned diffusion prior from high-quality PET/CT. Inference (b): start from noisy PET, iteratively denoise, and apply measurement-domain data-consistency updates. PPCR: progressively enforces physics constraints (late activation and coarse-to-fine PSF) to produce a scanner-consistent high-resolution PET output.
  • Figure 3: Qualitative comparison under the standard setting (8 mm, SR$\times$4).
  • Figure 4: Lesion-level quantitative comparison on two lung cancer cases. Each lesion reports $\Delta$SUVmax, $\Delta$SUVmean, and lesion NMSE within the lesion mask; lower values indicate better consistency.
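
The lesion-level quantities named in the Figure 4 caption can be sketched as follows. The exact definitions are not given in this section, so the sketch assumes absolute differences for $\Delta$SUVmax and $\Delta$SUVmean and an NMSE normalized by the reference energy inside the lesion mask. The function name `lesion_metrics` and these definitions are assumptions, not the authors' evaluation code.

```python
import numpy as np

def lesion_metrics(pred, ref, mask):
    """Compare predicted and reference SUV values inside a boolean lesion mask.

    Assumed definitions (not confirmed by the source):
      delta_suvmax  = |max(pred) - max(ref)| within the mask
      delta_suvmean = |mean(pred) - mean(ref)| within the mask
      nmse          = sum((pred - ref)^2) / sum(ref^2) within the mask
    """
    p = pred[mask]
    r = ref[mask]
    return {
        "delta_suvmax": float(abs(p.max() - r.max())),
        "delta_suvmean": float(abs(p.mean() - r.mean())),
        "nmse": float(np.sum((p - r) ** 2) / np.sum(r ** 2)),
    }

# Usage on a toy 2x2 "lesion": a prediction that doubles every reference value.
ref = np.array([[1.0, 2.0], [3.0, 4.0]])
mask = np.ones_like(ref, dtype=bool)
metrics = lesion_metrics(2.0 * ref, ref, mask)
```

Under these assumed definitions, all three values are zero when the prediction matches the reference inside the mask, which is consistent with the caption's statement that lower values indicate better consistency.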