Proximal-IMH: Proximal Posterior Proposals for Independent Metropolis-Hastings with Approximate Operators

Youguang Chen, George Biros

TL;DR

This work introduces Proximal-IMH, a scheme that removes bias from an approximate posterior distribution by correcting samples from the approximate posterior through an auxiliary optimization problem, and proves that the proximal correction tightens the match between approximate and exact posteriors, thereby improving acceptance rates and mixing.

Abstract

We consider the problem of sampling from a posterior distribution arising in Bayesian inverse problems in science, engineering, and imaging. Our method belongs to the family of independence Metropolis-Hastings (IMH) sampling algorithms, which are common in Bayesian inference. Relying on the existence of an approximate posterior distribution that is cheaper to sample from but may have significant bias, we introduce Proximal-IMH, a scheme that removes this bias by correcting samples from the approximate posterior through an auxiliary optimization problem. This yields a local adjustment that trades off adherence to the exact model against stability around the approximate reference point. For idealized settings, we prove that the proximal correction tightens the match between approximate and exact posteriors, thereby improving acceptance rates and mixing. The method applies to both linear and nonlinear input-output operators and is particularly suitable for inverse problems where exact posterior sampling is too expensive. We present numerical experiments including multimodal and data-driven priors with nonlinear input-output operators. The results show that Proximal-IMH reliably outperforms existing IMH variants.
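
To make the IMH setting concrete, here is a minimal sketch of a generic independence Metropolis-Hastings loop in Python. It is not the authors' implementation and it omits the proximal correction step, because the auxiliary optimization problem is not stated here. The 1D Gaussian target, the biased Gaussian proposal, and the names `imh`, `log_gauss`, and `run_demo` are illustrative assumptions. The sketch only shows how the gap between proposal and target enters the acceptance rule: the closer the proposal is to the target, the closer the acceptance rate is to one.

```python
import numpy as np


def imh(log_target, log_proposal, sample_proposal, n_steps, rng):
    """Generic independence Metropolis-Hastings on a scalar state.

    Proposals are drawn independently of the current state, so the
    acceptance ratio depends only on the importance weights
    w(x) = target(x) / proposal(x) (normalizing constants cancel).
    """
    x = sample_proposal(rng)
    log_w = log_target(x) - log_proposal(x)
    samples = np.empty(n_steps)
    n_accept = 0
    for k in range(n_steps):
        x_new = sample_proposal(rng)
        log_w_new = log_target(x_new) - log_proposal(x_new)
        # Accept with probability min(1, w(x_new) / w(x)).
        if np.log(rng.uniform()) < log_w_new - log_w:
            x, log_w = x_new, log_w_new
            n_accept += 1
        samples[k] = x
    return samples, n_accept / n_steps


def log_gauss(x, mean, std):
    """Log density of N(mean, std^2) up to an additive constant."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std)


def run_demo(n_steps=20000, seed=0):
    """Hypothetical toy case: target N(1, 1), biased proposal N(0, 2^2)."""
    rng = np.random.default_rng(seed)
    log_target = lambda x: log_gauss(x, 1.0, 1.0)
    log_proposal = lambda x: log_gauss(x, 0.0, 2.0)
    draw = lambda r: r.normal(0.0, 2.0)
    return imh(log_target, log_proposal, draw, n_steps, rng)


if __name__ == "__main__":
    samples, rate = run_demo()
    print("acceptance rate:", rate)
    print("sample mean:", samples.mean())
```

In this toy case the proposal is wider than the target, so the importance weights are bounded and the chain mixes despite the bias. A proposal that matches the target exactly would be accepted at every step, which is the limit that a better-matched proposal moves toward.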

Paper Structure

This paper contains 35 sections, 3 theorems, 62 equations, 13 figures, 4 tables, and 1 algorithm.

Key Result

Theorem 3.1

Under Assumption \ref{assume:kl}, let ${\bf F}_{ii}=s_i$ and $\widetilde{{\bf F}}_{ii}=\alpha_i s_i$ for $i\in[d]$. Define $\rho_i:=\frac{\alpha_i^2 s_i^2+\sigma^2}{s_i^2+\sigma^2}$ and $\zeta_i:=\frac{1}{(\alpha_i^2 s_i^2+\sigma^2)^2}$. Setting $\beta=\sigma^2$ in the Proximal proposal, the expected KL divergences are given by

Figures (13)

  • Figure 1: Sensitivity of the expected KL divergence of the Approx, Latent, and Proximal proposals (relative to the Exact posterior) for different noise levels, operator errors, observation ratios, and dimensions. Details of the experimental setup are given in \ref{tab:kl-experiments}.
  • Figure 2: Comparison of sampling performance in the bimodal test. Rows correspond to Tests I--III. From left to right, panels show Metropolis--Hastings acceptance rates for three IMH methods, convergence of the relative mean error for the IMH methods together with NUTS and MALA, and histograms of the projected samples $w^\top x$ for two levels of operator error. In the histogram panels, Approx-NUTS denotes samples drawn from the Approx posterior $\pi_a(x\mid y)$. For Proximal-IMH, we set $\beta=\sigma^2$. Results in the left panel are averaged over 5 independent trials for each method.
  • Figure 3: Operator discrepancy in the bimodal test. The quantities $\|\mathbf I-\widetilde{{\bf F}}^{-1}{\bf F}\|$ and $\|\mathbf I-{\bf K}^{-1}\|$ are reported for Latent-IMH and Proximal-IMH, respectively, where ${\bf K}$ is defined in \ref{eq:K-linear}. For Proximal-IMH, the hyperparameter is set to $\beta=\sigma^2$.
  • Figure 4: Effect of $\beta$ on Proximal-IMH in the bimodal test. Values in parentheses represent $\|{\bf A}-\widetilde{{\bf A}}\|/\|{\bf A}\|$. Blue curves show acceptance rates (left y-axis), and pink curves show relative mean errors after $10^5$ MH steps (right y-axis).
  • Figure 5: Ground truth, posterior mean, and MAP estimates for the MNIST inverse problem in Tests IV--VII. The MAP estimates are obtained from different random initial guesses for the optimization problem. In several cases the reconstructed MAP point is wrong, because the optimization problem that defines the MAP point is nonconvex.
  • ...and 8 more figures

Theorems & Definitions (7)

  • Theorem 3.1: Adapted from Proposition 3.3 in chen-biros26
  • Theorem 3.2: Mixing time for three IMH schemes.
  • proof: Proof sketch
  • proof: Proof of \ref{thm:kl-simple}
  • Theorem 1.1: Adapted from Theorem 4.3 in chen-biros26
  • proof
  • proof: Proof of \ref{thm:mixtime}