End-to-End PET Image Reconstruction via a Posterior-Mean Diffusion Model
Yiran Sun, Osama Mawlawi
TL;DR
This work tackles the sinogram-to-image PET reconstruction problem, where regression-based DL methods often blur details while posterior sampling methods risk introducing artifacts. It introduces the Posterior-Mean Denoising Diffusion Model (PMDM-PET), a two-stage approach that first learns an MMSE posterior-mean PET image from sinograms and then uses a conditional diffusion model to optimally transport that mean toward the ground-truth distribution under a perception constraint, grounded in the perception-distortion theory $D(0) = D^* + \min_{p_{\hat{\boldsymbol{x}}_0,\boldsymbol{x}_0^*}} \sum_t \mathrm{KL}(q(\hat{\boldsymbol{r}}_t|\hat{\boldsymbol{r}}_0,\boldsymbol{r}_0^*) \| p_\theta(\hat{\boldsymbol{r}}_t|\hat{\boldsymbol{r}}_{t+1},\boldsymbol{r}_0^*))$. The authors implement an MSE-trained estimator of $\boldsymbol{r}_0^*$ (DeepPET) and a reverse diffusion process conditioned on $\boldsymbol{r}_0^*$, train with a denoising loss, and evaluate on simulated BrainWeb data. They report that PMDM-PET yields higher PSNR and perceptual quality than five SOTA baselines, indicating a favorable perception-distortion tradeoff and potential clinical impact after further validation. Overall, the method advances PET reconstruction by jointly optimizing distortion and perceptual realism through a theoretically grounded, two-stage diffusion framework.
Abstract
Positron Emission Tomography (PET) is a functional imaging modality that enables the visualization of biochemical and physiological processes across various tissues. Recently, deep learning (DL)-based methods have demonstrated significant progress in directly mapping sinograms to PET images. However, regression-based DL models often yield overly smoothed reconstructions that lack detail (i.e., low distortion, low perceptual quality), whereas GAN-based and likelihood-based posterior sampling models tend to introduce undesirable artifacts into their predictions (i.e., high distortion, high perceptual quality), limiting their clinical applicability. To achieve a robust perception-distortion tradeoff, we propose the Posterior-Mean Denoising Diffusion Model (PMDM-PET), a novel approach that builds upon a recently established mathematical theory to explore the closed-form expression of the perception-distortion function in diffusion model space for PET image reconstruction from sinograms. Specifically, PMDM-PET first obtains posterior-mean PET predictions under the minimum mean squared error (MMSE) criterion, then optimally transports their distribution to the distribution of ground-truth PET images. Experimental results demonstrate that PMDM-PET not only generates realistic PET images with the minimum possible distortion and optimal perceptual quality but also outperforms five recent state-of-the-art (SOTA) DL baselines in both qualitative visual inspection and the quantitative pixel-wise metrics PSNR (dB), SSIM, and NRMSE.
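The tradeoff described above can be made concrete with a toy scalar example. The sketch below is not the PMDM-PET implementation. It assumes a hypothetical one-dimensional Gaussian model, $x \sim \mathcal{N}(0, 1)$ observed as $y = x + n$ with $n \sim \mathcal{N}(0, \sigma^2)$, for which all quantities have closed forms. The function names and the parameter `noise_var` are illustrative only. In this setting the posterior mean has the lowest MSE but a variance below that of the true signal (the analogue of over-smoothing), a posterior sample matches the true distribution at twice the MMSE distortion, and rescaling the posterior mean to unit variance (the optimal transport map between two zero-mean 1-D Gaussians) matches the true distribution at a distortion equal to the MMSE plus the squared Wasserstein-2 distance between the two distributions.

```python
import math


def mmse_distortion(noise_var):
    """MSE of the posterior mean E[x|y] = y / (1 + noise_var)."""
    return noise_var / (1.0 + noise_var)


def posterior_mean_variance(noise_var):
    """Variance of the posterior-mean estimate; below 1, i.e. over-smoothed."""
    return 1.0 / (1.0 + noise_var)


def posterior_sample_distortion(noise_var):
    """MSE of one draw from p(x|y): the MMSE plus the posterior variance."""
    return 2.0 * mmse_distortion(noise_var)


def transported_mean_distortion(noise_var):
    """MSE after transporting the posterior mean to the true N(0, 1) law.

    For zero-mean 1-D Gaussians the optimal transport map is a rescaling,
    and the added distortion is the squared Wasserstein-2 distance
    (difference of standard deviations, squared).
    """
    posterior_mean_std = math.sqrt(posterior_mean_variance(noise_var))
    w2_squared = (1.0 - posterior_mean_std) ** 2
    return mmse_distortion(noise_var) + w2_squared


if __name__ == "__main__":
    # Assumed noise levels, chosen only for illustration.
    for noise_var in (0.25, 1.0, 4.0):
        print(
            f"noise_var={noise_var:.2f}  "
            f"posterior mean MSE={mmse_distortion(noise_var):.4f}  "
            f"transported mean MSE={transported_mean_distortion(noise_var):.4f}  "
            f"posterior sample MSE={posterior_sample_distortion(noise_var):.4f}"
        )
```

For `noise_var = 1.0` the three distortions are 0.5, about 0.586, and 1.0. In this toy model, transporting the posterior mean reaches the true output distribution at lower distortion than posterior sampling.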
