Sampling and estimation on manifolds using the Langevin diffusion

Karthik Bharath, Alexander Lewis, Akash Sharma, Michael V Tretyakov

TL;DR

The paper addresses sampling from probability measures on manifolds by constructing intrinsic Langevin diffusions and discretizations that stay on the manifold. It develops two estimators (ensemble-averaging and time-averaging) and proves first-order weak error bounds by leveraging backward Kolmogorov and Poisson PDEs, showing the discretization error matches the Euclidean rate and yields a bound on the distance to the invariant measure. The authors extend the Euclidean weak-convergence framework to compact manifolds, discuss non-compact extensions, and demonstrate practical performance through numerical experiments on the sphere and the manifold of symmetric positive-definite (SPD) matrices, including non-convex potentials. The work provides geometry-preserving sampling tools with rigorous error control and a pathway to broader Langevin-based sampling methods on manifolds.

Abstract

Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion with invariant measure $\text{d}μ_φ\propto e^{-φ} \mathrm{dvol}_g $ on a compact Riemannian manifold. Two estimators of linear functionals of $μ_φ$ based on the discretized Markov process are considered: a time-averaging estimator based on a single trajectory and an ensemble-averaging estimator based on multiple independent trajectories. Imposing no restrictions beyond a nominal level of smoothness on $φ$, first-order error bounds, in discretization step size, on the bias and variance/mean-square error of both estimators are derived. The order of error matches the optimal rate in Euclidean and flat spaces, and leads to a first-order bound on distance between the invariant measure $μ_φ$ and a stationary measure of the discretized Markov process. This order is preserved even upon using retractions when exponential maps are unavailable in closed form, thus enhancing practicality of the proposed algorithms. Generality of the proof techniques, which exploit links between two partial differential equations and the semigroup of operators corresponding to the Langevin diffusion, renders them amenable for the study of a more general class of sampling algorithms related to the Langevin diffusion. Conditions for extending analysis to the case of non-compact manifolds are discussed. Numerical illustrations with distributions, log-concave and otherwise, on the manifolds of positive and negative curvature elucidate on the derived bounds and demonstrate practical utility of the sampling algorithm.
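
As a rough illustration of the two estimators described in the abstract, the following Python sketch runs a retraction-based Langevin discretization on the unit sphere $S^2$ and compares ensemble and time averages against a known value. Everything specific here is an assumption made for illustration and is not taken from the paper: the update $X_{n+1} = \mathrm{Ret}_{X_n}\big(-\tfrac{h}{2}\nabla\phi(X_n) + \sqrt{h}\,\xi_n\big)$ (one common convention whose continuous-time limit has invariant density proportional to $e^{-\phi}$), the projection retraction, Gaussian increments, the von Mises-Fisher target with $\kappa = 2$, and all function names. For this target on $S^2$ the mean of $f(x) = \mu^\top x$ is $\coth\kappa - 1/\kappa$, which serves as the reference value.

```python
import numpy as np

def tangent_project(x, v):
    """Project ambient vectors v (rows) onto the tangent spaces of the unit sphere at x."""
    return v - np.sum(x * v, axis=-1, keepdims=True) * x

def retract(x, v):
    """Projection retraction: step in the ambient space, then renormalize onto the sphere."""
    y = x + v
    return y / np.linalg.norm(y, axis=-1, keepdims=True)

def langevin_step(x, grad_phi, h, rng):
    """One step of the assumed scheme X' = Ret_X(-(h/2) grad phi(X) + sqrt(h) xi)."""
    xi = tangent_project(x, rng.standard_normal(x.shape))  # Gaussian in the tangent space
    drift = -0.5 * h * tangent_project(x, grad_phi(x))     # Riemannian gradient step
    return retract(x, drift + np.sqrt(h) * xi)

def ensemble_average(f, grad_phi, x0, h, n_steps, rng):
    """Average f over the final states of independent trajectories (rows of x0)."""
    x = x0.copy()
    for _ in range(n_steps):
        x = langevin_step(x, grad_phi, h, rng)
    return float(np.mean(f(x)))

def time_average(f, grad_phi, x0, h, n_steps, rng):
    """Average f along the trajectory (a single trajectory if x0 has one row)."""
    x = x0.copy()
    total = 0.0
    for _ in range(n_steps):
        x = langevin_step(x, grad_phi, h, rng)
        total += float(np.mean(f(x)))
    return total / n_steps

# Illustrative target (assumption): von Mises-Fisher on S^2, phi(x) = -kappa * mu^T x.
kappa = 2.0
mu = np.array([0.0, 0.0, 1.0])
grad_phi = lambda x: np.broadcast_to(-kappa * mu, x.shape)  # Euclidean gradient of phi
f = lambda x: x @ mu                                        # test function f(x) = mu^T x
exact = 1.0 / np.tanh(kappa) - 1.0 / kappa                  # closed-form mean of f on S^2

rng = np.random.default_rng(0)
start = np.array([1.0, 0.0, 0.0])
ens_est = ensemble_average(f, grad_phi, np.tile(start, (2000, 1)), h=0.01, n_steps=500, rng=rng)
time_est = time_average(f, grad_phi, start[None, :], h=0.01, n_steps=20000, rng=rng)
print(ens_est, time_est, exact)
```

Both estimates carry Monte Carlo error in addition to the discretization bias, so the step size $h$, the number of trajectories, and the trajectory length all need to be chosen with the target accuracy in mind.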

Paper Structure

This paper contains 31 sections, 5 theorems, 132 equations, 3 figures, 4 tables, 3 algorithms.

Key Result

Theorem 4.1

Under Assumptions assump:g and assump:phi and $\varphi \in C^{4,\epsilon}(M)$, the following bound holds for the Riemannian Langevin algorithm with retraction eqn_3.5 satisfying Assumption assump:2nd order ret: where $C, \lambda$ are positive constants independent of $h$ and $T$, and the constant $C$ depends linearly on $\Vert \varphi \Vert_{C^{4,\epsilon}(M)}$ but otherwise $C, \lambda$ are

Figures (3)

  • Figure 1: Sampling from a von Mises-Fisher distribution: log-log plot of the error $\texttt{err}$ against $h$. The black line corresponds to Algorithm \ref{algorithm2.1} with $\xi_n$ distributed according to \ref{eq:xi}; the blue line uses the standard Gaussian distribution for $\xi_n$. Error bars correspond to Monte Carlo error $\texttt{MCerr}$. The reference red line has gradient 1. The parameters are as in Table \ref{table:vmf}.
  • Figure 2: Sampling from the Riemannian-Gaussian distribution on $\mathcal{P}_3$: log-log plot of the estimation error $\texttt{err}$ against $h$. Error bars correspond to Monte Carlo error $\texttt{MCerr}$. The reference red line has gradient 1. The parameters for the Riemannian-Gaussian are as in Table \ref{table:rg}.
  • Figure 3: Sampling from a distribution with non-convex 'double-well' potential on $\mathcal{P}_3$: log-log plot of the estimation error $\texttt{err}$ against $h$. Error bars correspond to Monte Carlo error $\texttt{MCerr}$. The reference red line has gradient 1. The parameters are as in Table \ref{table:double-well}.
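
The figures compare measured errors with a reference line of gradient 1 on log-log axes. A minimal sketch of how such an empirical order could be read off is a least-squares fit of $\log(\texttt{err})$ against $\log(h)$. The helper name `empirical_order` and the data below are hypothetical: the arrays are synthetic, constructed to follow exact power laws, and are not results from the paper.

```python
import numpy as np

def empirical_order(h, err):
    """Least-squares slope of log(err) against log(h), i.e. the observed convergence order."""
    slope, _intercept = np.polyfit(np.log(h), np.log(err), 1)
    return float(slope)

# Synthetic step sizes and errors (NOT data from the paper).
h = np.array([0.2, 0.1, 0.05, 0.025])
err_first = 0.7 * h        # behaves like C * h, so the fitted slope is 1
err_second = 0.3 * h**2    # behaves like C * h^2, so the fitted slope is 2

order_first = empirical_order(h, err_first)
order_second = empirical_order(h, err_second)
print(order_first, order_second)
```

With real measurements the fitted slope is only meaningful once the Monte Carlo error is small relative to the discretization bias, which is why the figures report error bars alongside the errors.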

Theorems & Definitions (14)

  • Definition 2.1
  • Example 2.1
  • Example 2.2
  • Theorem 4.1: Bias of the ensemble averaging estimator
  • Theorem 4.2: Bias and mean-square error of the time-averaging estimator
  • Theorem 4.3: Distance to invariant measure
  • Remark 4.1
  • Theorem 4.4: Finite-time convergence
  • Remark 4.2
  • Remark 4.3
  • ...and 4 more