Benchmarking Uncertainty Quantification of Plug-and-Play Diffusion Priors for Inverse Problems Solving
Xiaoyu Qiu, Taewon Yang, Zhanhao Liu, Guanyang Wang, Liyue Shen
TL;DR
This work addresses a gap in how uncertainty is evaluated for Plug-and-Play Diffusion Priors in inverse problems, where the target is a posterior distribution rather than a single reconstruction. It introduces a unified uncertainty-aware benchmark that combines a controlled toy model, a taxonomy of solvers, and extensive real-data experiments (including out-of-distribution, or OOD, tasks) to study epistemic and aleatoric uncertainty. Key findings show that posterior-targeting solvers tend to calibrate uncertainty more faithfully than heuristic or MAP-like methods, that accuracy and uncertainty can be decoupled, and that uncertainty generally grows with data sparsity. OOD scenarios reveal distinct, task-specific uncertainty patterns. The paper provides practical guidance for evaluating and designing UQ-conscious diffusion samplers and highlights the need for uncertainty-aware benchmarks in scientific inverse problems.
Abstract
Plug-and-play diffusion priors (PnPDP) have become a powerful paradigm for solving inverse problems in scientific and engineering domains. Yet current evaluations of reconstruction quality emphasize point-estimate accuracy metrics computed on a single sample. Such metrics reflect neither the stochastic nature of PnPDP solvers nor the intrinsic uncertainty of inverse problems, both of which are critical for scientific tasks. This creates a fundamental mismatch: in inverse problems the desired output is typically a posterior distribution, and most PnPDP solvers induce a distribution over reconstructions, yet existing benchmarks evaluate only a single reconstruction and ignore distributional characteristics such as uncertainty. To address this gap, we conduct a systematic study that benchmarks the uncertainty quantification (UQ) of existing diffusion inverse solvers. Specifically, we design a rigorous toy-model simulation to evaluate the uncertainty behavior of various PnPDP solvers and propose a UQ-driven categorization. Through extensive experiments on toy simulations and diverse real-world scientific inverse problems, we observe uncertainty behaviors consistent with our taxonomy and theoretical justification, providing new insights for evaluating and understanding the uncertainty of PnPDP solvers.
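
The mismatch between point-estimate metrics and posterior uncertainty can be illustrated with a minimal sketch. It is not the toy model used in the paper. It assumes a scalar linear-Gaussian inverse problem (prior x ~ N(0, 1), measurement y = a·x + noise) whose posterior is known in closed form, and it compares two hypothetical solvers that share the same centre but differ in spread. The names `posterior_params`, `empirical_coverage`, and `sample_std_scale` are illustrative assumptions.

```python
import numpy as np


def posterior_params(y, a, sigma):
    """Closed-form posterior of x given y = a*x + N(0, sigma^2), with prior x ~ N(0, 1)."""
    post_var = 1.0 / (1.0 + a**2 / sigma**2)
    post_mean = post_var * a * y / sigma**2
    return post_mean, post_var


def empirical_coverage(sample_std_scale, n_trials=2000, n_samples=200,
                       a=1.0, sigma=0.5, level=0.9, seed=0):
    """Fraction of trials whose central `level` sample interval contains the true x.

    sample_std_scale = 1.0 mimics a solver that samples the exact posterior;
    values below 1.0 mimic an overconfident (MAP-like) solver whose samples
    cluster too tightly around the posterior mean. Both are hypothetical.
    """
    rng = np.random.default_rng(seed)
    x_true = rng.standard_normal(n_trials)                    # draw from the prior
    y = a * x_true + sigma * rng.standard_normal(n_trials)    # noisy measurements
    mean, var = posterior_params(y, a, sigma)
    # Each row holds the solver's samples for one measurement.
    noise = rng.standard_normal((n_trials, n_samples))
    samples = mean[:, None] + sample_std_scale * np.sqrt(var) * noise
    lo = np.quantile(samples, (1 - level) / 2, axis=1)
    hi = np.quantile(samples, 1 - (1 - level) / 2, axis=1)
    return float(np.mean((x_true >= lo) & (x_true <= hi)))


if __name__ == "__main__":
    print("posterior-like sampler coverage:", empirical_coverage(1.0))
    print("overconfident sampler coverage:", empirical_coverage(0.2))
```

Under these assumptions, both solvers are centred on the same posterior mean, so a point-accuracy metric on that estimate cannot tell them apart. Only the interval coverage shows that the narrower sampler understates uncertainty.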
