The shifted ODE method for underdamped Langevin MCMC

James Foster, Terry Lyons, Harald Oberhauser

TL;DR

This work introduces the shifted ODE method, a high-order approximation of the underdamped Langevin diffusion (ULD) for MCMC sampling that needs only gradient evaluations of the target log-density. By recasting the SDE as a controlled differential equation driven by a Brownian-bridge-based path, the authors derive two practical discretizations, SORT and SOFA, that use only a few gradient evaluations per step. They establish non-asymptotic 2-Wasserstein error bounds whose step complexity improves as more smoothness is assumed for the target log-density, and report faster convergence than other unadjusted Langevin MCMC algorithms in a logistic regression experiment. The approach offers a route to faster, high-accuracy unadjusted Langevin MCMC by leveraging ODE solvers while keeping the per-step cost tractable.

Abstract

In this paper, we consider the underdamped Langevin diffusion (ULD) and propose a numerical approximation using its associated ordinary differential equation (ODE). When used as a Markov Chain Monte Carlo (MCMC) algorithm, we show that the ODE approximation achieves a $2$-Wasserstein error of $\varepsilon$ in $\mathcal{O}\big(d^{\frac{1}{3}}/\varepsilon^{\frac{2}{3}}\big)$ steps under the standard smoothness and strong convexity assumptions on the target distribution. This matches the complexity of the randomized midpoint method proposed by Shen and Lee [NeurIPS 2019], which was shown to be order optimal by Cao, Lu and Wang. However, the main feature of the proposed numerical method is that it can utilize additional smoothness of the target log-density $f$. More concretely, we show that the ODE approximation achieves a $2$-Wasserstein error of $\varepsilon$ in $\mathcal{O}\big(d^{\frac{2}{5}}/\varepsilon^{\frac{2}{5}}\big)$ and $\mathcal{O}\big(\sqrt{d}/\varepsilon^{\frac{1}{3}}\big)$ steps when Lipschitz continuity is assumed for the Hessian and the third derivative of $f$, respectively. By discretizing this ODE using a third order Runge-Kutta method, we obtain a practical MCMC method that uses just two additional gradient evaluations per step. In our experiment, where the target comes from a logistic regression, this method shows faster convergence than other unadjusted Langevin MCMC algorithms.

Paper Structure

This paper contains 25 sections, 46 theorems, 247 equations, and 3 figures.

Key Result

Theorem 3.1

Consider the SDE (eq:ULD) and suppose that the potential $f$ is three times continuously differentiable. Then for $0\leq s\leq t$, where $h = t - s$ and the remainder terms $R^{x}(h, x_{s}, v_{s})$, $R^{v}(h, x_{s}, v_{s})$ are given by

Figures (3)

  • Figure 3.1: Brownian motion approximated using piecewise linear paths with vertical pieces.
  • Figure 3.2: Graph outlining the general strategy for our error analysis.
  • Figure 5.1: Graph showing $S_{N,n}$ computed for various numerical methods and step sizes $h = \frac{T}{N}$.

Theorems & Definitions (113)

  • Definition 1.1: Shifted ODE method
  • Definition 1.2: The SORT method
  • Definition 1.3: The SOFA method
  • Theorem 3.1: High order Taylor expansion of ULD
  • proof
  • Remark 3.2
  • Theorem 3.3: Taylor expansion of underdamped Langevin CDE
  • Theorem 3.4
  • proof
  • Theorem 3.5
  • ...and 103 more