Concentration of the Langevin Algorithm's Stationary Distribution
Jason M. Altschuler, Kunal Talwar
TL;DR
The analysis uses a rotation-invariant moment generating function (aka Bessel function) to study the stationary dynamics of the Langevin Algorithm, and shows that for any nontrivial stepsize $\eta > 0$, the stationary distribution $\pi_{\eta}$ is sub-exponential when the potential is convex.
Abstract
A canonical algorithm for log-concave sampling is the Langevin Algorithm, aka the Langevin Diffusion run with some discretization stepsize $η> 0$. This discretization leads the Langevin Algorithm to have a stationary distribution $π_η$ which differs from the stationary distribution $π$ of the Langevin Diffusion, and it is an important challenge to understand whether the well-known properties of $π$ extend to $π_η$. In particular, while concentration properties such as isoperimetry and rapidly decaying tails are classically known for $π$, the analogous properties for $π_η$ are open questions with algorithmic implications. This note provides a first step in this direction by establishing concentration results for $π_η$ that mirror classical results for $π$. Specifically, we show that for any nontrivial stepsize $η> 0$, $π_η$ is sub-exponential (respectively, sub-Gaussian) when the potential is convex (respectively, strongly convex). Moreover, the concentration bounds we show are essentially tight. We also show that these concentration bounds extend to all iterates along the trajectory of the Langevin Algorithm, and to inexact implementations which use sub-Gaussian estimates of the gradient. Key to our analysis is the use of a rotation-invariant moment generating function (aka Bessel function) to study the stationary dynamics of the Langevin Algorithm. This technique may be of independent interest because it enables directly analyzing the discrete-time stationary distribution $π_η$ without going through the continuous-time stationary distribution $π$ as an intermediary.
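
To make the gap between $π_η$ and $π$ concrete, the following is a minimal sketch, not taken from the paper, of the standard Langevin Algorithm update $x_{k+1} = x_k - η∇f(x_k) + \sqrt{2η}\, z_k$ with $z_k \sim N(0, I)$. It assumes the one-dimensional quadratic potential $f(x) = x^2/2$, for which $π = N(0, 1)$ and a short fixed-point calculation on the variance gives $π_η = N(0, 1/(1 - η/2))$ for $0 < η < 2$. The function names, the choice of potential, and the simulation parameters are illustrative assumptions.

```python
import numpy as np


def langevin_algorithm(grad_f, x0, eta, n_steps, rng):
    """Iterate x <- x - eta * grad_f(x) + sqrt(2 * eta) * z, with z ~ N(0, I)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - eta * grad_f(x) + np.sqrt(2.0 * eta) * noise
    return x


def stationary_variance_quadratic(eta):
    """Variance of pi_eta for the assumed potential f(x) = x^2 / 2.

    Solves v = (1 - eta)^2 * v + 2 * eta, which is valid for 0 < eta < 2.
    """
    return 1.0 / (1.0 - eta / 2.0)


def empirical_stationary_variance(eta, n_chains, n_steps, seed):
    """Run independent chains in parallel and return the variance of the last iterate."""
    rng = np.random.default_rng(seed)
    x0 = np.zeros(n_chains)  # each entry is an independent 1-D chain
    x_final = langevin_algorithm(lambda x: x, x0, eta, n_steps, rng)  # grad f(x) = x
    return float(np.var(x_final))
```

Calling `empirical_stationary_variance(eta=0.5, n_chains=20000, n_steps=200, seed=0)` and comparing the result with `stationary_variance_quadratic(0.5)` should show the iterates settling near variance $1/(1 - η/2)$ rather than the variance $1$ of $π$. In this strongly convex example $π_η$ is itself Gaussian, which is consistent with the sub-Gaussian concentration stated above.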
