MAP Estimation with Denoisers: Convergence Rates and Guarantees

Scott Pesme, Giacomo Meanti, Michael Arbel, Julien Mairal

TL;DR

The paper tackles MAP estimation for inverse problems in which the proximal operator of the negative log-prior is intractable. It shows that a simple MMSE-averaging denoiser recursion converges to the proximal operator of $-\ln p$ when the prior $p$ is log-concave. Interpreting the MMSE step via the Tweedie identity turns the method into gradient descent on a sequence of smoothed proximal objectives $F_{\sigma}$, with a provable $\tilde{O}(1/k)$ rate of convergence to the true proximal point $\mathrm{prox}_{-\tau\ln p}(y)$ as the smoothing level vanishes. The result is a parameter-free, practical algorithm whose output can be plugged into proximal-gradient (PGD) MAP solvers with explicit convergence bounds. Extensions cover affine-subspace priors and approximate scores. Together, these results give denoiser-based inverse-problem solvers a theoretical foundation and connect heuristic methods to rigorous optimisation guarantees.
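The link between the MMSE step and gradient descent can be made concrete with a short derivation. This is a sketch under stated assumptions rather than the paper's own statement: $p_\sigma$ is taken to be $p$ convolved with $\mathcal{N}(0, \sigma^2 I)$, $D_\sigma(x) = \mathbb{E}[X \mid X + \sigma Z = x]$ is notation introduced here for the MMSE denoiser, and the stepsize $\gamma$ is chosen for illustration.

```latex
\begin{align*}
F_\sigma(x) &= \tfrac{1}{2}\|y - x\|^2 - \tau \ln p_\sigma(x), \\
\nabla \ln p_\sigma(x) &= \frac{D_\sigma(x) - x}{\sigma^2}
  && \text{(Tweedie identity)}, \\
\nabla F_\sigma(x) &= (x - y) - \frac{\tau}{\sigma^2}\bigl(D_\sigma(x) - x\bigr), \\
x - \gamma \nabla F_\sigma(x) &= \gamma\, y + (1 - \gamma)\, D_\sigma(x)
  && \text{for } \gamma = \frac{\sigma^2}{\sigma^2 + \tau}.
\end{align*}
```

With $\sigma_k^2 = \tau/(k+1)$ this stepsize evaluates to $\gamma = 1/(k+2)$, so one gradient step on $F_{\sigma_k}$ is a weighted average of the observation $y$ and the denoised iterate.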

Abstract

Denoiser models have become powerful tools for inverse problems, enabling the use of pretrained networks to approximate the score of a smoothed prior distribution. These models are often used in heuristic iterative schemes aimed at solving Maximum a Posteriori (MAP) optimisation problems, where the proximal operator of the negative log-prior plays a central role. In practice, this operator is intractable, and practitioners plug in a pretrained denoiser as a surrogate, despite the lack of general theoretical justification for this substitution. In this work, we show that a simple algorithm, closely related to several used in practice, provably converges to the proximal operator under a log-concavity assumption on the prior $p$. We show that this algorithm can be interpreted as gradient descent on smoothed proximal objectives. Our analysis thus provides a theoretical foundation for a class of empirically successful but previously heuristic methods.

Paper Structure

This paper contains 32 sections, 20 theorems, 164 equations, 2 figures, 1 algorithm.

Key Result

Proposition 1

The MMSE-averaging recursion with choice of weights $\alpha_k = 1 /(k+2)$ and noise sequence $\sigma_k^2 = \tau/(k+1)$ can be rewritten:

Figures (2)

  • Figure 1: Visualisation of the level curves of the smoothed proximal objective $F_\sigma(x) = \frac{1}{2} \| y - x \|^2 - \tau \ln p_\sigma(x)$ for different values of $\sigma$. The unsmoothed objective $F$ is poorly conditioned (left plot), but the conditioning improves significantly as $\sigma$ increases.
  • Figure 2: Illustration of the iterate trajectories (left plot) and convergence rates (right plot) of naive gradient descent on $F$ (which has condition number $\kappa = 500$) versus gradient descent on the smoothed objectives $(F_{\sigma_k})_k$, using a toy 2D Gaussian prior. Gradient descent on $F$, using a stepsize $\alpha = 0.8 / L_F$ (chosen for better visualisation), suffers from poor conditioning and makes little progress toward the optimal solution $\mathrm{prox}_{- \tau \ln p}(y)$. In contrast, gradient descent on the smoothed objectives $(F_{\sigma_k})_k$ converges rapidly, clearly exhibiting an $O(1/k)$ rate.
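A toy experiment in the spirit of Figure 2 can be sketched in a few lines of NumPy. Everything specific here is an assumption made for illustration, not taken from the paper: the diagonal Gaussian prior $\mathcal{N}(0, \mathrm{diag}(1, 1/999))$ (which gives $F$ a condition number of 500 when $\tau = 1$), the observation `y`, the initialisation at `y`, and the form of the recursion, written as $x_{k+1} = \alpha_k y + (1 - \alpha_k) D_{\sigma_k}(x_k)$ with the weights and noise levels of Proposition 1. For a Gaussian prior both the MMSE denoiser and the proximal operator have closed forms, so the error can be measured exactly.

```python
import numpy as np

def gaussian_mmse_denoiser(x, prior_var, sigma2):
    # Posterior mean E[X | X + sigma*Z = x] for X ~ N(0, diag(prior_var)).
    return prior_var / (prior_var + sigma2) * x

def prox_via_denoiser(y, prior_var, tau, n_iters):
    # Assumed recursion: average y with the denoised iterate,
    # using alpha_k = 1/(k+2) and sigma_k^2 = tau/(k+1).
    x = y.copy()
    for k in range(n_iters):
        alpha = 1.0 / (k + 2)
        sigma2 = tau / (k + 1)
        x = alpha * y + (1.0 - alpha) * gaussian_mmse_denoiser(x, prior_var, sigma2)
    return x

def prox_via_plain_gd(y, prior_var, tau, n_iters):
    # Gradient descent on the unsmoothed F(x) = 0.5*||y - x||^2 - tau*ln p(x),
    # with stepsize 0.8 / L_F as in the Figure 2 caption.
    smoothness = 1.0 + tau / prior_var.min()  # L_F for a diagonal Gaussian prior
    step = 0.8 / smoothness
    x = y.copy()
    for _ in range(n_iters):
        grad = (x - y) + tau * x / prior_var
        x = x - step * grad
    return x

tau = 1.0
prior_var = np.array([1.0, 1.0 / 999.0])  # hypothetical toy prior, kappa(F) = 500
y = np.array([1.0, 1.0])                  # hypothetical observation

# Closed-form proximal point for a Gaussian prior: (I + tau * Sigma^{-1})^{-1} y.
exact = prior_var / (prior_var + tau) * y

err_smooth = np.linalg.norm(prox_via_denoiser(y, prior_var, tau, 200) - exact)
err_plain = np.linalg.norm(prox_via_plain_gd(y, prior_var, tau, 200) - exact)
print(f"smoothed objectives: {err_smooth:.2e}, plain gradient descent: {err_plain:.2e}")
```

In this setting the averaged recursion needs no stepsize tuning, whereas plain gradient descent is held back by the well-conditioned direction, where the stepsize $0.8/L_F$ is far too small.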

Theorems & Definitions (35)

  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Theorem 1: Convergence to the Proximal operator
  • Theorem 2: Convergence towards the MAP estimator with explicit bounds
  • Proposition 4
  • proof
  • Proposition 4
  • proof
  • Theorem 2: Convergence to the Proximal operator
  • ...and 25 more
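
The step from the proximal operator to the MAP estimator (Theorem 2 above) can be illustrated by nesting the denoiser recursion inside proximal gradient descent. The sketch below is hypothetical throughout: the least-squares data term $\frac{1}{2}\|Ax - b\|^2$, the matrix `A`, the vector `b`, the diagonal Gaussian prior, the iteration counts, and the assumed form of the inner recursion are all chosen for illustration, and this is not the paper's algorithm or its bounds. A Gaussian prior is used only because the MAP estimator then has a closed form to compare against.

```python
import numpy as np

def gaussian_mmse_denoiser(x, prior_var, sigma2):
    # Posterior mean E[X | X + sigma*Z = x] for X ~ N(0, diag(prior_var)).
    return prior_var / (prior_var + sigma2) * x

def approx_prox(v, prior_var, tau, n_inner):
    # Approximates prox_{-tau ln p}(v) with the assumed averaging recursion
    # (alpha_k = 1/(k+2), sigma_k^2 = tau/(k+1)).
    x = v.copy()
    for k in range(n_inner):
        alpha = 1.0 / (k + 2)
        sigma2 = tau / (k + 1)
        x = alpha * v + (1.0 - alpha) * gaussian_mmse_denoiser(x, prior_var, sigma2)
    return x

def pgd_map(A, b, prior_var, n_outer=50, n_inner=1000):
    # Proximal gradient descent on 0.5*||Ax - b||^2 - ln p(x),
    # with the exact prox replaced by the denoiser-based approximation.
    tau = 1.0 / np.linalg.norm(A, 2) ** 2  # stepsize 1/L for the data term
    x = np.zeros(A.shape[1])
    for _ in range(n_outer):
        v = x - tau * A.T @ (A @ x - b)    # gradient step on the data term
        x = approx_prox(v, prior_var, tau, n_inner)
    return x

A = np.array([[1.0, 0.5], [0.0, 1.0]])  # hypothetical forward operator
b = np.array([1.0, 2.0])                # hypothetical measurements
prior_var = np.array([1.0, 0.5])        # hypothetical prior variances

x_map = pgd_map(A, b, prior_var)
# Closed-form MAP for a Gaussian prior: (A^T A + Sigma^{-1})^{-1} A^T b.
x_exact = np.linalg.solve(A.T @ A + np.diag(1.0 / prior_var), A.T @ b)
map_error = np.linalg.norm(x_map - x_exact)
print(f"distance to closed-form MAP: {map_error:.2e}")
```

Because each inner solve is only approximate, the outer iterates settle at a small distance from the exact MAP point; increasing `n_inner` shrinks that gap.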