Accelerated Bayesian imaging by relaxed proximal-point Langevin sampling

Teresa Klatzer; Paul Dobson; Yoann Altmann; Marcelo Pereyra; Jesús María Sanz-Serna; Konstantinos C. Zygalakis

Accelerated Bayesian imaging by relaxed proximal-point Langevin sampling

Teresa Klatzer, Paul Dobson, Yoann Altmann, Marcelo Pereyra, Jesús María Sanz-Serna, Konstantinos C. Zygalakis

TL;DR

This work tackles Bayesian imaging with convex geometry, addressing the computational bottlenecks of high-dimensional, non-smooth posteriors. It introduces IMLA, a stochastic relaxed proximal-point Langevin algorithm that, for smooth or Moreau‑Yosida smoothed models, acts as an implicit-midpoint discretization and achieves asymptotic unbiasedness for Gaussian targets while converging in an accelerated $\sqrt{\kappa}$ fashion for $\kappa$-strongly log-concave targets. For non-smooth targets, IMLA corresponds to a Leimkuhler–Matthews discretization of a Moreau‑Yosida posterior, yielding reduced bias compared with Euler–Maruyama-based methods. The authors provide non-asymptotic convergence bounds, identify optimal time-steps, and demonstrate superior performance on image deconvolution tasks with Gaussian and Poisson noise using assumption-driven and data-driven convex priors, including CRR‑NN priors; code is publicly available. The results establish IMLA as a robust, accelerated proximal-MCMC option for large-scale imaging with uncertainty quantification, while offering clear avenues for integration with more complex Bayesian schemes and mild extensions to non-convex settings.

Abstract

This paper presents a new accelerated proximal Markov chain Monte Carlo methodology to perform Bayesian inference in imaging inverse problems with an underlying convex geometry. The proposed strategy takes the form of a stochastic relaxed proximal-point iteration that admits two complementary interpretations. For models that are smooth or regularised by Moreau-Yosida smoothing, the algorithm is equivalent to an implicit midpoint discretisation of an overdamped Langevin diffusion targeting the posterior distribution of interest. This discretisation is asymptotically unbiased for Gaussian targets and shown to converge in an accelerated manner for any target that is $κ$-strongly log-concave (i.e., requiring in the order of $\sqrtκ$ iterations to converge, similarly to accelerated optimisation schemes), comparing favorably to [M. Pereyra, L. Vargas Mieles, K.C. Zygalakis, SIAM J. Imaging Sciences, 13,2 (2020), pp. 905-935] which is only provably accelerated for Gaussian targets and has bias. For models that are not smooth, the algorithm is equivalent to a Leimkuhler-Matthews discretisation of a Langevin diffusion targeting a Moreau-Yosida approximation of the posterior distribution of interest, and hence achieves a significantly lower bias than conventional unadjusted Langevin strategies based on the Euler-Maruyama discretisation. For targets that are $κ$-strongly log-concave, the provided non-asymptotic convergence analysis also identifies the optimal time step which maximizes the convergence speed. The proposed methodology is demonstrated through a range of experiments related to image deconvolution with Gaussian and Poisson noise, with assumption-driven and data-driven convex priors. Source codes for the numerical experiments of this paper are available from https://github.com/MI2G/accelerated-langevin-imla.

Accelerated Bayesian imaging by relaxed proximal-point Langevin sampling

TL;DR

fashion for

-strongly log-concave targets. For non-smooth targets, IMLA corresponds to a Leimkuhler–Matthews discretization of a Moreau‑Yosida posterior, yielding reduced bias compared with Euler–Maruyama-based methods. The authors provide non-asymptotic convergence bounds, identify optimal time-steps, and demonstrate superior performance on image deconvolution tasks with Gaussian and Poisson noise using assumption-driven and data-driven convex priors, including CRR‑NN priors; code is publicly available. The results establish IMLA as a robust, accelerated proximal-MCMC option for large-scale imaging with uncertainty quantification, while offering clear avenues for integration with more complex Bayesian schemes and mild extensions to non-convex settings.

Abstract

-strongly log-concave (i.e., requiring in the order of

iterations to converge, similarly to accelerated optimisation schemes), comparing favorably to [M. Pereyra, L. Vargas Mieles, K.C. Zygalakis, SIAM J. Imaging Sciences, 13,2 (2020), pp. 905-935] which is only provably accelerated for Gaussian targets and has bias. For models that are not smooth, the algorithm is equivalent to a Leimkuhler-Matthews discretisation of a Langevin diffusion targeting a Moreau-Yosida approximation of the posterior distribution of interest, and hence achieves a significantly lower bias than conventional unadjusted Langevin strategies based on the Euler-Maruyama discretisation. For targets that are

-strongly log-concave, the provided non-asymptotic convergence analysis also identifies the optimal time step which maximizes the convergence speed. The proposed methodology is demonstrated through a range of experiments related to image deconvolution with Gaussian and Poisson noise, with assumption-driven and data-driven convex priors. Source codes for the numerical experiments of this paper are available from https://github.com/MI2G/accelerated-langevin-imla.

Paper Structure (23 sections, 6 theorems, 89 equations, 13 figures, 6 tables, 2 algorithms)

This paper contains 23 sections, 6 theorems, 89 equations, 13 figures, 6 tables, 2 algorithms.

Introduction
Problem Statement
Bayesian inference and imaging inverse problems
Sampling using the overdamped Langevin equation
Non-smooth distributions
Proposed method: IMLA
Analysis in the Gaussian case
Analysis in the strongly log-concave case
Illustrative experiment: a toy denoising model based on a Gaussian mixture prior
Illustrative experiment: One dimensional distributions
Algorithmic recommendations: IMLA
Imaging experiments
Image deconvolution using a CRR-NN prior
Deconvolution with Poisson noise
Comparison according to integration time
...and 8 more sections

Key Result

Proposition 3.1

Let $\pi (x) \propto \exp{(-\frac{1}{2} x^T \Sigma^{-1} x)}$ with $\Sigma = \text{diag}(\sigma_1^2,...,\sigma_d^2)$, and let $Q_{n}$ be the probability measure associated with $n$ iterations of the generic Markov kernel eqn:generalNumMethod. Then the 2-Wasserstein distance between $\pi$ and $Q_{n}$ where In addition the following bound holds where is the numerical invariant measure and

Figures (13)

Figure 1: (a) step-size $\delta$ as a function of the condition number $\kappa$ for different choices of $\theta$ (the black dotted line corresponds to $\kappa\delta =2$, the stability limit of the explicit integrator) and level of accuracy $\epsilon$. (b) Number of steps $n$ such that $W_{2}(\pi,Q_{n})^{2}<\epsilon^{2}$ as a function of the condition number $\kappa$ for different choices of $\theta$ and $\epsilon$.
Figure 2: Top: Comparison of histograms of the scalar statistic $\log \pi(x)$ for the Gaussian mixture model. This figure shows the different biases induced by the sampling algorithms, and significant agreement between the exact and the IMLA statistic. Bottom: Traces for the same data showing the stationary behavior of the chains.
Figure 3: Histograms of the Laplace distribution (first column), the uniform distribution (second column), a light-tailed distribution with $\pi(x)\propto e^{-x^4}$ (third column), and a heavy-tailed Cauchy distribution (last column) computed with $15 \times 10^6$ iterations generated by MYULA (first row) and IMLA (second row) and ILA with $\theta=1$ (third row), for $\delta=0.05$ (left, right) and $\delta=10^{-4}$ (middle). The dashed black curve represents the probability density function of the respective distribution. Last row: Trace plots of the first $10^6$ iterations to show the behaviour of the chains.
Figure 4: Motion deconvolution experiment: Ground truth images $x$ (left) of size $320\times320$ with observations $y$ (right) for castle, person and lizard images (top to bottom) and their respective blur kernels in the top left corner of $y$.
Figure 5: MAP solution for the motion deconvolution problem using a gradient method goujon2022crrnn for the CRR-NN model (first line) and posterior means for IMLA, SKROCK, ULA and PnP-ULA (line 2-5) for castle, lizard and person images (left to right) including the corresponding PSNR values in $dB$.
...and 8 more figures

Theorems & Definitions (13)

Proposition 3.1
Proposition 3.2
proof
Theorem 3.3
proof
Proposition 3.4
Lemma A.1
proof : Proof of Lemma \ref{['lem:C']}
Proposition A.2
proof : Proof of Proposition \ref{['prop:theta_1_bias']}
...and 3 more

Accelerated Bayesian imaging by relaxed proximal-point Langevin sampling

TL;DR

Abstract

Accelerated Bayesian imaging by relaxed proximal-point Langevin sampling

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (13)

Theorems & Definitions (13)