Free Hunch: Denoiser Covariance Estimation for Diffusion Models Without Extra Costs

Severi Rissanen, Markus Heinonen, Arno Solin

TL;DR

Free Hunch (FH) tackles the challenge of estimating denoiser covariance in diffusion models for training-free conditional generation, especially in linear inverse problems. It fuses two information streams (the data covariance learned from training data and curvature information from the generative trajectory via the second-order Tweedie formula) to produce accurate posterior moments $\boldsymbol{\mu}_{0|t}$ and $\boldsymbol{\Sigma}_{0|t}$ without retraining. FH propagates covariance information across noise levels (time updates) and injects new low-rank information during sampling (space updates) using a BFGS-style scheme and efficient Woodbury-based inverses, enabling practical high-dimensional usage. In experiments on Gaussian mixtures and ImageNet-based linear inverse problems, FH consistently outperforms strong baselines at low step counts, reduces reliance on post-hoc scaling, and yields better LPIPS while preserving fine details. The work demonstrates that training-free covariance estimation can substantially improve conditional generation in diffusion models, broadening their applicability to inverse problems and controlled generation.
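To illustrate why Woodbury-based inverses make this practical in high dimensions, here is a minimal NumPy sketch. It assumes a covariance of the form "diagonal plus low-rank", $\mathrm{diag}(d) + U C U^\top$; that structure, the function name `woodbury_solve`, and the variables `d`, `U`, `C`, `b` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def woodbury_solve(d, U, C, b):
    """Solve (diag(d) + U @ C @ U.T) x = b without forming the n x n matrix.

    Hypothetical helper for illustration: d is a positive vector (n,),
    U is (n, k) with k << n, C is an invertible (k, k) matrix.
    """
    Dinv_b = b / d                    # D^{-1} b, costs O(n)
    Dinv_U = U / d[:, None]           # D^{-1} U, costs O(n k)
    # Small k x k system: C^{-1} + U^T D^{-1} U
    cap = np.linalg.inv(C) + U.T @ Dinv_U
    correction = Dinv_U @ np.linalg.solve(cap, U.T @ Dinv_b)
    return Dinv_b - correction

rng = np.random.default_rng(0)
n, k = 200, 3
d = rng.uniform(0.5, 2.0, size=n)     # diagonal part
U = rng.standard_normal((n, k))       # low-rank directions
C = np.eye(k)
b = rng.standard_normal(n)

x_fast = woodbury_solve(d, U, C, b)
# Dense reference solve, only feasible here because n is small
x_dense = np.linalg.solve(np.diag(d) + U @ C @ U.T, b)
```

Only a $k \times k$ system is solved, so the cost grows linearly in the data dimension $n$ rather than cubically.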

Abstract

The covariance for clean data given a noisy observation is an important quantity in many training-free guided generation methods for diffusion models. Current methods require heavy test-time computation, alter the standard diffusion training process or denoiser architecture, or make heavy approximations. We propose a new framework that sidesteps these issues by using covariance information that is available for free from training data and the curvature of the generative trajectory, which is linked to the covariance through the second-order Tweedie's formula. We integrate these sources of information using (i) a novel method to transfer covariance estimates across noise levels and (ii) low-rank updates in a given noise level. We validate the method on linear inverse problems, where it outperforms recent baselines, especially with fewer diffusion steps.
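For reference, the link between curvature and covariance mentioned above can be written out as follows. This is a sketch that assumes the variance-exploding noising process ${\bm{x}}_t = {\bm{x}}_0 + \sigma(t)\,\boldsymbol{\epsilon}$ with $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})$; the paper's exact parameterization may differ.

$$
\begin{aligned}
{\bm{\mu}}_{0\,|\, t}({\bm{x}}_t) &= \mathbb{E}[{\bm{x}}_0 \,|\, {\bm{x}}_t] = {\bm{x}}_t + \sigma(t)^2 \nabla_{{\bm{x}}_t} \log p({\bm{x}}_t), \\
\boldsymbol{\Sigma}_{0\,|\, t}({\bm{x}}_t) &= \mathrm{Cov}[{\bm{x}}_0 \,|\, {\bm{x}}_t] = \sigma(t)^2 \left( \mathbf{I} + \sigma(t)^2 \nabla^2_{{\bm{x}}_t} \log p({\bm{x}}_t) \right) = \sigma(t)^2 \nabla_{{\bm{x}}_t} {\bm{\mu}}_{0\,|\, t}({\bm{x}}_t).
\end{aligned}
$$

The first line is Tweedie's formula for the denoiser mean. The second is its second-order counterpart: the denoiser covariance is, up to the factor $\sigma(t)^2$, the Jacobian of the denoiser mean, and therefore depends on the Hessian (curvature) of $\log p({\bm{x}}_t)$.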

Paper Structure

This paper contains 52 sections, 50 equations, 18 figures, 3 tables, 4 algorithms.

Figures (18)

  • Figure 1: Comparison of different conditional diffusion methods for deblurring, with a low number of solver steps (15 Heun iterations). DPS [chungdiffusion] and $\Pi$GDM [song2023pseudoinverse] work well with many steps, but accurate covariance estimates matter more for small step counts.
  • Figure 2: (a) A distribution $p({\bm{x}}_0)$ represented by a pretrained diffusion model, and a Gaussian likelihood $p({\bm{y}}\,|\,{\bm{x}}_0)$. (b) The (exact) posterior $p({\bm{x}}_0\,|\, {\bm{y}}) \propto p({\bm{x}}_0)p({\bm{y}}\,|\,{\bm{x}}_0)$. (c) Generated samples from a model with a heuristic diagonal denoiser covariance $\boldsymbol{\Sigma}_{0\,|\, t}({\bm{x}}_t)$, and a generative ODE trajectory with approximated $p({\bm{x}}_0\,|\, {\bm{x}}_t)$ shapes represented as ellipses along the trajectory. (d) Generated samples with our denoiser covariance.
  • Figure 3: Sketch of our method during sampling.
  • Figure 4: Norm of ${\bm{\mu}}_{0\,|\, t}({\bm{x}}) + \sigma(t)^2 \nabla_{{\bm{x}}_t}\log p({\bm{y}}\,|\, {\bm{x}}_t)$ for different covariance estimation methods on ImageNet 256$\times$256. Values ${>}1$ indicate overestimation since the data is normalized to $[-1,1]$.
  • Figure 5: Different methods for posterior inference in the example in Figure 2 and Jensen--Shannon divergences to the true posterior.
  • ...and 13 more figures
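The quantity plotted in Figure 4, ${\bm{\mu}}_{0\,|\, t} + \sigma(t)^2 \nabla_{{\bm{x}}_t}\log p({\bm{y}}\,|\, {\bm{x}}_t)$, can be sketched for a linear-Gaussian measurement. The sketch below makes several assumptions that are not stated in the captions: a measurement model ${\bm{y}} = {\bm{A}}{\bm{x}}_0 + \text{noise}$ with noise standard deviation `sigma_y`, a Gaussian approximation $p({\bm{x}}_0\,|\,{\bm{x}}_t) \approx \mathcal{N}({\bm{\mu}}_{0\,|\, t}, \boldsymbol{\Sigma}_{0\,|\, t})$, and a covariance treated as constant in ${\bm{x}}_t$. The function name `guided_denoiser_mean` is hypothetical.

```python
import numpy as np

def guided_denoiser_mean(mu, Sigma, A, y, sigma_y):
    """Return mu + Sigma A^T (A Sigma A^T + sigma_y^2 I)^{-1} (y - A mu).

    Under the assumptions in the text, this equals
    mu_{0|t} + sigma(t)^2 * grad_{x_t} log p(y | x_t).
    """
    S = A @ Sigma @ A.T + sigma_y**2 * np.eye(A.shape[0])
    return mu + Sigma @ A.T @ np.linalg.solve(S, y - A @ mu)

# Toy check: identity covariance, identity measurement, unit noise
mu = np.zeros(2)
Sigma = np.eye(2)
A = np.eye(2)
y = np.array([2.0, 4.0])
guided = guided_denoiser_mean(mu, Sigma, A, y, sigma_y=1.0)
```

In this toy case the prior and the measurement are equally informative, so the result lies halfway between `mu` and `y`. An overly large `Sigma` pushes the result further toward values that fit `y`, which is one way a poor covariance estimate can produce the overestimation described in the Figure 4 caption.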