SITCOM: Step-wise Triple-Consistent Diffusion Sampling for Inverse Problems
Ismail Alkhouri, Shijun Liang, Cheng-Han Huang, Jimmy Dai, Qing Qu, Saiprasad Ravishankar, Rongrong Wang
TL;DR
This paper tackles inverse problems with diffusion models (DMs) by identifying three trajectory-consistency conditions and introducing SITCOM, an optimization-based sampler that enforces measurement, forward, and step-wise backward diffusion consistency at every sampling step. By optimizing over the DM input at each step and regularizing via the DM denoiser, SITCOM stays on the diffusion trajectory while fitting the measurements, which allows far fewer reverse steps without sacrificing quality. Empirical results across eight restoration tasks and one MRI reconstruction task show SITCOM delivering competitive or superior quantitative metrics (PSNR/SSIM/LPIPS) and significantly faster runtimes than state-of-the-art baselines. The paper also presents extensions, such as latent SITCOM and SITCOM-ODE, and points to potential applications in 3D and medical imaging, marking a practical advance in diffusion-based solving of ill-posed inverse problems.
Abstract
Diffusion models (DMs) are a class of generative models that allow sampling from a distribution learned over a training set. When applied to solving inverse problems, the reverse sampling steps are modified to approximately sample from a measurement-conditioned distribution. However, these modifications may be unsuitable for certain settings (e.g., presence of measurement noise) and non-linear tasks, as they often struggle to correct errors from earlier steps and generally require a large number of optimization and/or sampling steps. To address these challenges, we state three conditions for achieving measurement-consistent diffusion trajectories. Building on these conditions, we propose a new optimization-based sampling method that not only enforces standard data-manifold measurement consistency and forward diffusion consistency, as seen in previous studies, but also incorporates our proposed step-wise and network-regularized backward diffusion consistency, which maintains a diffusion trajectory by optimizing over the input of the pre-trained model at every sampling step. By enforcing these conditions (implicitly or explicitly), our sampler requires significantly fewer reverse steps. Therefore, we refer to our method as Step-wise Triple-Consistent Sampling (SITCOM). Compared to state-of-the-art (SOTA) baselines, our experiments across several linear and non-linear tasks (with natural and medical images) demonstrate that SITCOM achieves competitive or superior results in terms of standard similarity metrics and run-time.
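To make the three consistency conditions concrete, the following is a minimal NumPy sketch of one way the step-wise loop described in the abstract could be organized. It is a toy illustration under stated assumptions, not the authors' implementation: the pre-trained network is replaced by a hypothetical linear denoiser (`toy_denoiser`, a projection onto a known subspace), the forward operator `A` is a random matrix, and the inner objective (squared measurement error plus a proximity penalty weighted by `lam`) as well as the names `sitcom_like_sampler`, `abars`, and `inner_steps` are assumptions introduced here.

```python
import numpy as np


def toy_denoiser(v, abar, P):
    """Hypothetical stand-in for the pre-trained DM: maps a noisy input v
    at noise level abar to an estimate of the clean signal x0."""
    return P @ v / np.sqrt(abar)


def sitcom_like_sampler(y, A, P, abars, lam=0.01, inner_steps=100, seed=0):
    """Toy step-wise sampler; abars[t] is the cumulative signal level, abars[0] = 1."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    smax = np.linalg.norm(A @ P, 2)  # largest singular value of A @ P
    x = rng.standard_normal(n)       # start the trajectory from pure noise
    for t in range(len(abars) - 1, 0, -1):
        abar = abars[t]
        # Step size 1/L, with L the Lipschitz constant of the quadratic objective.
        step = 1.0 / (2.0 * (smax**2 / abar + lam))
        # Backward consistency (assumed form): optimize over the denoiser
        # INPUT v, initialized at x_t and penalized for drifting away from it.
        v = x.copy()
        for _ in range(inner_steps):
            residual = A @ toy_denoiser(v, abar, P) - y
            grad = 2.0 * (P @ (A.T @ residual)) / np.sqrt(abar) + 2.0 * lam * (v - x)
            v -= step * grad
        # Measurement consistency: the clean estimate is a denoiser output
        # whose measurements approximately match y.
        x0_hat = toy_denoiser(v, abar, P)
        # Forward consistency: re-noise the estimate to the next noise level.
        noise = rng.standard_normal(n)
        x = np.sqrt(abars[t - 1]) * x0_hat + np.sqrt(1.0 - abars[t - 1]) * noise
    return x


# Toy problem: a unit-norm signal in a 2-D subspace of R^16, 12 noiseless measurements.
rng = np.random.default_rng(1)
n, m, k = 16, 12, 2
U, _ = np.linalg.qr(rng.standard_normal((n, k)))  # orthonormal basis of the "data manifold"
P = U @ U.T
A = rng.standard_normal((m, n))
x_true = U @ rng.standard_normal(k)
x_true /= np.linalg.norm(x_true)
y = A @ x_true

abars = np.linspace(1.0, 0.01, 21)  # 20 reverse steps
x_rec = sitcom_like_sampler(y, A, P, abars)
rel_err = np.linalg.norm(x_rec - x_true) / np.linalg.norm(x_true)
rel_res = np.linalg.norm(A @ x_rec - y) / np.linalg.norm(y)
print(f"relative error: {rel_err:.4f}, relative residual: {rel_res:.4f}")
```

In this sketch the data-fidelity work happens inside each reverse step rather than as a separate correction afterwards, which is why a short schedule (20 steps here) can suffice for the toy problem. With a real network the analytic gradient would be replaced by automatic differentiation through the denoiser.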
