SITCOM: Step-wise Triple-Consistent Diffusion Sampling for Inverse Problems

Ismail Alkhouri, Shijun Liang, Cheng-Han Huang, Jimmy Dai, Qing Qu, Saiprasad Ravishankar, Rongrong Wang

TL;DR

This paper tackles inverse problems with diffusion models by identifying three trajectory-consistency conditions and introducing SITCOM, an optimization-based sampler that enforces measurement, forward, and step-wise backward diffusion consistency at every sampling step. By optimizing over the DM input at each step and regularizing via the DM denoiser, SITCOM maintains diffusion trajectory fidelity while achieving data fidelity, enabling far fewer reverse steps without sacrificing quality. Empirical results across eight restoration tasks and one MRI reconstruction task show SITCOM delivering competitive or superior quantitative metrics (PSNR/SSIM/LPIPS) and significantly faster runtimes compared to state-of-the-art baselines. The approach also includes flexible extensions, such as latent SITCOM and SITCOM-ODE, and highlights potential applications to 3D and medical imaging, marking a practical advance in diffusion-based solving of ill-posed inverse problems.

Abstract

Diffusion models (DMs) are a class of generative models that allow sampling from a distribution learned over a training set. When applied to solving inverse problems, the reverse sampling steps are modified to approximately sample from a measurement-conditioned distribution. However, these modifications may be unsuitable for certain settings (e.g., presence of measurement noise) and non-linear tasks, as they often struggle to correct errors from earlier steps and generally require a large number of optimization and/or sampling steps. To address these challenges, we state three conditions for achieving measurement-consistent diffusion trajectories. Building on these conditions, we propose a new optimization-based sampling method that not only enforces standard data manifold measurement consistency and forward diffusion consistency, as seen in previous studies, but also incorporates our proposed step-wise and network-regularized backward diffusion consistency that maintains a diffusion trajectory by optimizing over the input of the pre-trained model at every sampling step. By enforcing these conditions (implicitly or explicitly), our sampler requires significantly fewer reverse steps. Therefore, we refer to our method as Step-wise Triple-Consistent Sampling (SITCOM). Compared to SOTA baselines, our experiments across several linear and non-linear tasks (with natural and medical images) demonstrate that SITCOM achieves competitive or superior results in terms of standard similarity metrics and run-time.
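
The step-wise procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a hypothetical linear noise predictor (`eps_toy`, exact only when clean signals follow a standard Gaussian), a hypothetical cosine schedule (`alpha_bar`), a masking (inpainting-style) measurement operator, a proximity weight `lam`, and plain re-noising as the forward-consistency step. All names and parameter values are illustrative.

```python
import math
import random

def alpha_bar(t, T):
    # Hypothetical noise schedule: equals 1 at t=0 and decays towards 0 as t -> T.
    return math.cos(0.5 * math.pi * t / (T + 1)) ** 2

def eps_toy(v, ab):
    # Stand-in for the pre-trained noise predictor (assumption of this sketch).
    # This linear map is the exact posterior-mean noise when x_0 ~ N(0, I).
    return [math.sqrt(1.0 - ab) * vi for vi in v]

def tweedie(v, ab):
    # Tweedie-style estimate of the clean signal from a noisy input v.
    eps = eps_toy(v, ab)
    return [(vi - math.sqrt(1.0 - ab) * ei) / math.sqrt(ab)
            for vi, ei in zip(v, eps)]

def sitcom_like_step(x_t, y, mask, ab_t, ab_prev, rng, K=20, lr=0.5, lam=0.01):
    v = list(x_t)  # optimize over the denoiser INPUT, initialized at x_t
    for _ in range(K):
        x0 = tweedie(v, ab_t)
        # Gradient of sum((mask*x0 - y)^2) + lam*sum((v - x_t)^2) w.r.t. v.
        # For this toy denoiser d(x0)/d(v) = sqrt(ab_t), so it is analytic.
        grad = [2.0 * math.sqrt(ab_t) * m * (m * x0i - yi) + 2.0 * lam * (vi - xi)
                for x0i, yi, m, vi, xi in zip(x0, y, mask, v, x_t)]
        v = [vi - lr * g for vi, g in zip(v, grad)]
    x0_hat = tweedie(v, ab_t)  # measurement-consistent clean estimate
    # Assumed forward-consistency step: re-noise the estimate to level t-1.
    return [math.sqrt(ab_prev) * x0i + math.sqrt(1.0 - ab_prev) * rng.gauss(0.0, 1.0)
            for x0i in x0_hat]

def run_toy_sitcom(seed=0, n=8, T=20):
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mask = [1 if i % 2 == 0 else 0 for i in range(n)]  # observe every other entry
    y = [m * xi for m, xi in zip(mask, x_true)]        # noiseless measurements
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]        # start from pure noise
    for t in range(T, 0, -1):
        x = sitcom_like_step(x, y, mask, alpha_bar(t, T), alpha_bar(t - 1, T), rng)
    return x, y, mask

x_rec, y, mask = run_toy_sitcom(seed=0)
```

In this sketch the inner loop runs `K` gradient iterations per sampling step. Setting `K=1` gives the single-gradient-step regime that Proposition 4.1 relates to DPS with resampling. With a real pre-trained network the gradient would come from automatic differentiation rather than the closed form used here.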

Paper Structure

This paper contains 42 sections, 1 theorem, 29 equations, 14 figures, and 17 tables.

Key Result

Proposition 4.1

SITCOM with $K=1$ reduces to DPS with the resampling formula in Eq. \ref{eq:resample}.

Figures (14)

  • Figure 1: Effects of enforcing backward-consistency in box-inpainting: Results of using Tweedie's formula without measurement consistency (columns 3 to 5), enforcing measurement-consistency via Eq. \ref{eqn: x hat prime 0} (columns 6 to 8), and enforcing both measurement-consistency and backward-consistency via Eq. \ref{eq:main_opt} (columns 9 to 11) at different time steps $t'$. Experimental details are given in Appendix \ref{sec: append impact of backward consis interm}.
  • Figure 2: Illustrative diagram of the proposed procedure in SITCOM (left). Conceptual illustration of SITCOM, where $\mathcal{M}_t$ is the DM generative manifold at time $t$ and $\mathcal{C}_t$ is the subset of images that are backward-consistent (right).
  • Figure 3: Results of applying optimization-based measurement consistency, for which the optimization variable is the DM output (resp. input), are shown in the first (resp. second) row for each task: Box Inpainting (top) and Gaussian Deblurring (bottom).
  • Figure 4: Results of Phase Retrieval on two images (top row) from the FFHQ dataset. Rows 2, 3, 4, and 5 correspond to the results of DPS, DAPS, SITCOM (ours), and SITCOM-ODE (ours), respectively.
  • Figure 5: Reconstructed images using our proposed approach, SITCOM, and DM-based baselines (DDS and Score-MRI). Each row corresponds to a different mask pattern and acceleration factor. The ground truth and degraded images are shown in the first and second columns, respectively, followed by the reconstructed images from the baselines. The last column presents our method. PSNR results are given at the bottom of each reconstructed image. For all tasks, SITCOM reconstructions contain sharper and clearer image features than other methods.
  • ...and 9 more figures

Theorems & Definitions (6)

  • Definition 3.1: Backward Consistency
  • Remark 1
  • Remark 2
  • Proposition 4.1
  • Proof
  • Remark 3