Concentration of the Langevin Algorithm's Stationary Distribution

Jason M. Altschuler; Kunal Talwar

TL;DR

For any nontrivial stepsize $\eta>0$, the stationary distribution $\pi_{\eta}$ of the Langevin Algorithm is sub-exponential when the potential is convex and sub-Gaussian when it is strongly convex. Key to the analysis is a rotation-invariant moment generating function (aka Bessel function), which is used to study the stationary dynamics of the Langevin Algorithm.

Abstract

A canonical algorithm for log-concave sampling is the Langevin Algorithm, aka the Langevin Diffusion run with some discretization stepsize $η> 0$. This discretization leads the Langevin Algorithm to have a stationary distribution $π_η$ which differs from the stationary distribution $π$ of the Langevin Diffusion, and it is an important challenge to understand whether the well-known properties of $π$ extend to $π_η$. In particular, while concentration properties such as isoperimetry and rapidly decaying tails are classically known for $π$, the analogous properties for $π_η$ are open questions with algorithmic implications. This note provides a first step in this direction by establishing concentration results for $π_η$ that mirror classical results for $π$. Specifically, we show that for any nontrivial stepsize $η> 0$, $π_η$ is sub-exponential (respectively, sub-Gaussian) when the potential is convex (respectively, strongly convex). Moreover, the concentration bounds we show are essentially tight. We also show that these concentration bounds extend to all iterates along the trajectory of the Langevin Algorithm, and to inexact implementations which use sub-Gaussian estimates of the gradient. Key to our analysis is the use of a rotation-invariant moment generating function (aka Bessel function) to study the stationary dynamics of the Langevin Algorithm. This technique may be of independent interest because it enables directly analyzing the discrete-time stationary distribution $π_η$ without going through the continuous-time stationary distribution $π$ as an intermediary.
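
The gap between $π_η$ and $π$ described above can be illustrated with a minimal sketch. It assumes the standard form of the Langevin Algorithm update, $x_{t+1} = x_t - η \nabla f(x_t) + \sqrt{2η}\, Z_t$ with $Z_t$ standard Gaussian, and uses the one-dimensional quadratic potential $f(x) = x^2/2$ purely as an illustrative choice (so $π$ is the standard Gaussian). All function names are hypothetical and not taken from the paper.

```python
import math
import random


def langevin_step(x, grad, eta, rng):
    """One Langevin Algorithm update: a gradient step plus Gaussian noise."""
    return x - eta * grad(x) + math.sqrt(2.0 * eta) * rng.gauss(0.0, 1.0)


def stationary_variance_quadratic(eta):
    """Variance of pi_eta for the illustrative potential f(x) = x^2 / 2.

    The update is x' = (1 - eta) * x + sqrt(2 * eta) * z, so a stationary
    variance v solves v = (1 - eta)**2 * v + 2 * eta, i.e. v = 1 / (1 - eta / 2).
    Valid for 0 < eta < 2; the continuous-time target pi has variance 1.
    """
    return 1.0 / (1.0 - eta / 2.0)


def empirical_variance(eta, n_steps=200_000, burn_in=1_000, seed=0):
    """Estimate the variance of pi_eta by running the chain on f(x) = x^2 / 2."""
    rng = random.Random(seed)
    grad = lambda x: x  # gradient of the assumed potential x^2 / 2
    x = 0.0
    total = 0.0
    total_sq = 0.0
    for t in range(burn_in + n_steps):
        x = langevin_step(x, grad, eta, rng)
        if t >= burn_in:
            total += x
            total_sq += x * x
    mean = total / n_steps
    return total_sq / n_steps - mean * mean


if __name__ == "__main__":
    for eta in (0.01, 0.1, 0.5):
        print(eta, stationary_variance_quadratic(eta), empirical_variance(eta))
```

In this toy case $π_η$ is still Gaussian, with variance inflated from $1$ to $1/(1 - η/2)$: the discretization bias is explicit, and $π_η$ remains sub-Gaussian, in line with the strongly convex case of the result.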

Paper Structure (15 sections, 12 theorems, 46 equations)

Introduction
Preliminaries
Lyapunov function
  Relation to rotation-invariant MGF
  Explicit expression via Bessel functions
  Properties of the Lyapunov function
Sub-Gaussian concentration for strongly convex potentials
  Proof
  Tightness
Sub-exponential concentration for convex potentials
  Proof
  Tightness
Further extensions
  Concentration along the trajectory
  Using inexact gradients

Key Result

Lemma 3.2

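For any dimension $d \geqslant 2$ and argument $z > 0$, $\phi_d(z) = \Gamma(\alpha+1) \cdot (2/z)^{\alpha} \cdot I_{\alpha}(z)$, where $\alpha := (d-2)/2$ and $I_{\alpha}$ denotes the modified Bessel function of the first kind. (In dimension $d=1$, we simply have $\phi_d(z) = \cosh(z)$.)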
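
As a numerical sanity check, the sketch below evaluates the rotation-invariant MGF under the assumption that it takes the standard form $\phi_d(z) = \Gamma(\alpha+1)\,(2/z)^{\alpha}\, I_{\alpha}(z)$ with $\alpha = (d-2)/2$. Substituting the power series of the modified Bessel function $I_{\alpha}$ gives $\phi_d(z) = \Gamma(\alpha+1) \sum_{k \ge 0} (z/2)^{2k} / (k!\, \Gamma(k+\alpha+1))$, which needs only the standard library. The function name `phi` and the truncation level are illustrative choices, not from the paper.

```python
import math


def phi(d, z, terms=60):
    """Rotation-invariant MGF phi_d(z), evaluated by a truncated power series.

    Assumes phi_d(z) = Gamma(alpha + 1) * (2 / z)**alpha * I_alpha(z) with
    alpha = (d - 2) / 2. Expanding I_alpha cancels the (2 / z)**alpha factor,
    leaving Gamma(alpha + 1) * sum_k (z / 2)**(2k) / (k! * Gamma(k + alpha + 1)).
    Intended for moderate d and z only (no overflow handling).
    """
    alpha = (d - 2) / 2.0
    total = 0.0
    for k in range(terms):
        total += (z / 2.0) ** (2 * k) / (math.factorial(k) * math.gamma(k + alpha + 1.0))
    return math.gamma(alpha + 1.0) * total


if __name__ == "__main__":
    z = 1.7
    # Closed forms recovered by the series: cosh(z) for d = 1, sinh(z) / z for d = 3.
    print(phi(1, z), math.cosh(z))
    print(phi(3, z), math.sinh(z) / z)
```

The series reproduces $\cosh(z)$ when $d = 1$ (that is, $\alpha = -1/2$), matching the parenthetical remark in the lemma, and $\sinh(z)/z$ when $d = 3$.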

Theorems & Definitions (25)

Definition 3.1: Lyapunov function
Lemma 3.2: Explicit formula for Lyapunov function
proof
Lemma 3.3: Behavior of $\Phi$ under Gaussian convolution
proof
Lemma 3.4: Properties of rotation-invariant MGF
proof
Theorem 4.1: Sub-Gaussianity of $\pi_{\eta}$ for strongly convex potentials
Lemma 4.2: Contractivity of gradient descent step
proof: Proof of Theorem 4.1
...and 15 more