BayesDiff: Estimating Pixel-wise Uncertainty in Diffusion via Bayesian Inference
Siqi Kou, Lei Gan, Dequan Wang, Chongxuan Li, Zhijie Deng
TL;DR
BayesDiff addresses the lack of a sample-wise quality metric for diffusion-generated images by estimating pixel-wise Bayesian uncertainty during image generation. It leverages a last-layer Laplace approximation to quantify predictive uncertainty of the noise predictor and derives an uncertainty iteration principle to propagate uncertainty through the reverse diffusion process. The approach enables image-level filtering, diverse augmentation, and artifact rectification for text-to-image tasks, with an efficient variant (BayesDiff-Skip) to reduce computational cost. Across multiple backbones and samplers, higher pixel-wise uncertainty correlates with clutter and misalignment, while uncertainty-guided resampling can rectify artifacts, demonstrating practical utility in real-world diffusion workflows.
Abstract
Diffusion models have impressive image generation capability, but low-quality generations still exist, and their identification remains challenging due to the lack of a proper sample-wise metric. To address this, we propose BayesDiff, a pixel-wise uncertainty estimator for generations from diffusion models based on Bayesian inference. In particular, we derive a novel uncertainty iteration principle to characterize the uncertainty dynamics in diffusion, and leverage the last-layer Laplace approximation for efficient Bayesian inference. The estimated pixel-wise uncertainty can not only be aggregated into a sample-wise metric to filter out low-fidelity images, but can also aid in augmenting successful generations and rectifying artifacts in failed generations in text-to-image tasks. Extensive experiments demonstrate the efficacy of BayesDiff and its promise for practical applications.
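To illustrate how pixel-wise uncertainty might be aggregated into a sample-wise metric for filtering, here is a minimal Python sketch. It assumes each generated image already comes with a per-pixel variance map (for example, a nested C x H x W list), that a plain sum is used as the aggregate, and that a fixed fraction of the most uncertain samples is discarded. The function names, the sum aggregation, and the `drop_fraction` parameter are hypothetical choices made for this example and are not taken from the abstract.

```python
from typing import List, Sequence, Union

# A variance map is a number or an arbitrarily nested sequence of numbers.
Nested = Union[float, Sequence["Nested"]]


def aggregate_uncertainty(pixel_variance: Nested) -> float:
    """Sum a nested map of per-pixel variances into one scalar score.

    Assumption: a plain sum serves as the sample-wise aggregate.
    """
    if isinstance(pixel_variance, (int, float)):
        return float(pixel_variance)
    return sum(aggregate_uncertainty(v) for v in pixel_variance)


def filter_by_uncertainty(
    variance_maps: Sequence[Nested], drop_fraction: float = 0.25
) -> List[int]:
    """Return the indices of samples kept after dropping the most uncertain.

    Assumption: `drop_fraction` is an illustrative threshold, not a
    value prescribed by the source.
    """
    scores = [aggregate_uncertainty(m) for m in variance_maps]
    n_drop = int(len(scores) * drop_fraction)
    # Sort sample indices from least to most uncertain.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    kept = order[: len(scores) - n_drop]
    # Restore the original sample order for the kept indices.
    return sorted(kept)
```

With four samples and `drop_fraction=0.25`, the single sample whose summed variance is largest would be removed and the indices of the other three returned in their original order.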
