Proximal Langevin Sampling With Inexact Proximal Mapping

Matthias J. Ehrhardt; Lorenz Kuger; Carola-Bibiane Schönlieb

TL;DR

This work addresses the challenge of drawing samples from log-concave, non-smooth posteriors in Bayesian imaging by allowing inexact proximal evaluations within a proximal Langevin framework. It extends proximal stochastic gradient Langevin dynamics to an inexact-proximal setting (iPGLA) and provides nonasymptotic and asymptotic convergence results in Wasserstein distance. These results quantify the bias introduced by proximal errors and show that it vanishes when the errors decay in strongly convex settings. The authors develop a rigorous analysis based on the $\varepsilon$-subdifferential and relate sampling to optimization in the Wasserstein space. They validate the theory with imaging experiments, including wavelet-based deblurring, TV denoising, and Poisson-informed deblurring, which demonstrate practical trade-offs between proximal accuracy and sampling speed. The results enable efficient high-dimensional Bayesian imaging with inexact proximal computations and offer guidance on choosing inner-iteration budgets and step sizes to balance accuracy and compute.

Abstract

In order to solve tasks like uncertainty quantification or hypothesis tests in Bayesian imaging inverse problems, we often have to draw samples from the arising posterior distribution. For the usually log-concave but high-dimensional posteriors, Markov chain Monte Carlo methods based on time discretizations of Langevin diffusion are a popular tool. If the potential defining the distribution is non-smooth, these discretizations are usually of an implicit form, leading to Langevin sampling algorithms that require the evaluation of proximal operators. For some of the potentials relevant in imaging problems, this is only possible approximately, using an iterative scheme. We investigate the behaviour of a proximal Langevin algorithm in the presence of errors in the evaluation of proximal mappings. We generalize existing non-asymptotic and asymptotic convergence results of the exact algorithm to our inexact setting and quantify the bias between the target and the algorithm's stationary distribution due to the errors. We show that the additional bias stays bounded for bounded errors and converges to zero for decaying errors in a strongly convex setting. We apply the inexact algorithm to sample numerically from the posterior of typical imaging inverse problems in which we can only approximate the proximal operator by an iterative scheme and validate our theoretical convergence results.
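
To make the setting described in the abstract concrete, the following is a minimal sketch of a proximal-gradient Langevin iteration with a perturbed proximal step. It is not the paper's iPGLA. It assumes the common update form $x^{k+1} \approx \mathrm{prox}_{\gamma G}(x^k - \gamma \nabla F(x^k) + \sqrt{2\gamma}\,\xi^k)$, a toy potential $F(x) = x^2/2$, $G(x) = |x|$, and a crude error model in which the exact proximal point is shifted by at most `delta`. The paper measures inexactness through $\varepsilon$-subdifferentials instead. The names `soft_threshold`, `inexact_prox_abs`, `inexact_pgla` and `delta` are hypothetical and chosen for illustration only.

```python
import numpy as np


def soft_threshold(v, t):
    """Exact proximal map of t * |.| (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def inexact_prox_abs(v, t, delta, rng):
    """Hypothetical inexact prox: exact point plus an error of size at most delta.

    This bounded perturbation is an assumed stand-in for a truncated
    iterative solver, not the paper's notion of inexactness.
    """
    error = delta * rng.uniform(-1.0, 1.0, size=np.shape(v))
    return soft_threshold(v, t) + error


def inexact_pgla(grad_f, x0, gamma, delta, n_steps, seed=0):
    """Run n_steps of a proximal-gradient Langevin chain with a perturbed prox."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps,) + x.shape)
    for k in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Forward (gradient) step on the smooth part plus Gaussian noise.
        y = x - gamma * grad_f(x) + np.sqrt(2.0 * gamma) * noise
        # Backward (proximal) step on the non-smooth part, evaluated inexactly.
        x = inexact_prox_abs(y, gamma, delta, rng)
        samples[k] = x
    return samples


# Toy target with density proportional to exp(-x^2/2 - |x|) in one dimension.
samples = inexact_pgla(lambda x: x, x0=np.zeros(1), gamma=0.05,
                       delta=0.01, n_steps=20000)
```

Rerunning the sketch with a larger `delta` or a larger `gamma` gives a rough feel for the accuracy versus cost trade-off the abstract refers to, although the quantitative bias bounds are those of the paper's theorems, not of this toy model.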

Paper Structure (19 sections, 12 theorems, 71 equations, 11 figures, 1 algorithm)

Introduction
Related Work
Contributions
A Langevin Sampling Algorithm with Inexact Proximal Points
Problem Formulation
Existing Proximal Langevin Sampling Schemes
A Notion of Inexactness for Proximal Points
The Proposed Inexact Sampling Scheme
Convergence Theory
Sampling as Optimization in the Wasserstein Space
Auxiliary Results on Inexact Proximal Points
Nonasymptotic and Asymptotic Convergence
Numerical Results
Validation on a Toy Example
Wavelet-Based Deblurring -- Comparison of Inexact and Exact PGLA
...and 4 more sections

Key Result

Lemma 3.1

Let assumption1, assumption2, and assumption3 hold. For every $\mu \in \mathcal{P}_2(\mathcal{X})$ and $\psi \in L^2(\mu^\ast;\mathcal{X})$, we have $\mathcal{D}(\mu, \psi) \ge 0$ and $\mathcal{L}(\mu, \psi) \le \mathop{\mathrm{KL}}\nolimits(\mu,\mu^\ast)$. The pair $(\mu^\ast, \psi^\ast)$ is a saddle point of $\mathcal{L}$, and furthermore $\mathcal{L}(\mu^\ast, \psi) = 0$ if and only if $\psi = \psi^\ast$ holds $\mu^\ast$-a.e.

Figures (11)

Figure 1: Illustration of the inexact subgradient of the absolute value function and the corresponding inexact proximal points.
Figure 2: Visualization of the choice of step sizes in \ref{rem:stepsize_choice_decaying_W2dist_inexact_PLA} when $C=1$, $\lambda_F = 0.5$ and $L=1$ (left) and the corresponding values $A_K$ and the upper bound $M=M'+C'$ (right). For $k\ge 5$, it holds $\gamma_k = C'/k$.
Figure 3: Squared Wasserstein distances between target and samples generated by \ref{algo:iPGLA} in a 1D toy example. (a): Samples are drawn using a fixed step size $\gamma$ and fixed inexactness level $\varepsilon$. The computed squared distances $\mathcal{W}_2^2(\tilde{\mu}^k,\tilde{\mu}^\ast)$ are shown as solid lines, the corresponding upper bounds predicted by \ref{eq:type2_asymptotic_bound_primal_wasserstein_boundedErrors} as dashed lines. (b): Samples are drawn using a fixed step size $\gamma$ and inexactness $\varepsilon_k \propto k^{-\beta}$ decaying to zero at different rates, with the corresponding upper bounds \ref{eq:type2_asymptotic_bound_primal_wasserstein_decreasingErrors}. (c): Samples are drawn with decaying step sizes $\gamma_k$ chosen as in \ref{rem:stepsize_choice_decaying_W2dist_inexact_PLA} and $\varepsilon_k \propto k^{\beta}$ for different rates $\beta<0$. The rate of convergence to the target shown in \ref{thm:asymptotic_result_type2error_decaying_stepsizes} depends on the decay rate of $\varepsilon_k$.
Figure 4: Comparison of exact and inexact version of PGLA for wavelet-based deblurring. For the inexact version we use errors $\varepsilon = C_0 \tilde{\varepsilon}$.
Figure 5: Test image for the TV-denoising experiment. The MMSE is computed using $10^5$ samples from \ref{algo:iPGLA} with a small error threshold ($\tilde{\varepsilon} = 10^{-5}$).
...and 6 more figures
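
As a small companion to the Figure 1 caption, the sketch below computes the $\varepsilon$-subdifferential of the absolute value in closed form. It assumes the standard definition $\partial_\varepsilon f(x) = \{p : f(y) \ge f(x) + p\,(y - x) - \varepsilon \text{ for all } y\}$. For $f = |\cdot|$ this reduces to the conditions $|p| \le 1$ and $|x| - p\,x \le \varepsilon$, which always describe an interval. The function name `eps_subdiff_abs` is hypothetical, and the figure in the paper may use a different parametrization.

```python
def eps_subdiff_abs(x, eps):
    """Return (lo, hi) such that the eps-subdifferential of |.| at x is [lo, hi].

    Uses the conditions |p| <= 1 and |x| - p * x <= eps.
    """
    if eps < 0:
        raise ValueError("eps must be non-negative")
    if x == 0:
        # At the kink every slope in [-1, 1] is already an exact subgradient.
        return (-1.0, 1.0)
    # For x > 0 the admissible slopes satisfy p >= 1 - eps / x (and p <= 1).
    bound = max(-1.0, 1.0 - eps / abs(x))
    if x > 0:
        return (bound, 1.0)
    # For x < 0 the interval is mirrored.
    return (-1.0, -bound)
```

With `eps = 0` the interval collapses to the usual subgradient `sign(x)` away from zero, and it grows to the full interval `[-1, 1]` once `eps >= 2 * abs(x)`.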

Theorems & Definitions (28)

Example 2.1
Definition 2.5: Rasch2020
Example 2.6
Example 2.7
Lemma 3.1: Salim2020
Lemma 3.2: see Thm 2.4.4 (iv) in Zalinescu2002
Lemma 3.3: see Lemma 1 in Salzo2012
Lemma 3.4
Remark 3.5
proof
...and 18 more
