Fast Convergence of $Φ$-Divergence Along the Unadjusted Langevin Algorithm and Proximal Sampler

Siddharth Mitra; Andre Wibisono

TL;DR

The paper analyzes the mixing time of two Langevin-based samplers, the Unadjusted Langevin Algorithm (ULA) and the Proximal Sampler, in the general framework of $\Phi$-divergences. It combines an approach based on strong data processing inequalities (SDPI) with $\Phi$-Sobolev inequalities ($\Phi$-SI) to obtain exponential decay bounds for $\mathsf{D}_{\Phi}$, valid whenever the stationary distribution satisfies a $\Phi$-SI. For ULA with step size $\eta$, the results show exponential convergence to the biased limit $\nu^{\eta}$ with rate $\left(1+\frac{2\alpha\eta}{(1+\eta L)^2}\right)^{-k}$ under $L$-smoothness and a $\Phi$-SI with constant $\alpha$ for $\nu^{\eta}$; for the Proximal Sampler, convergence holds to the target $\nu^X$ with rate $\left(1+\alpha\eta\right)^{-2k}$, assuming $\nu^X$ satisfies a $\Phi$-SI with constant $\alpha$. The work unifies and extends known results for KL divergence (relative entropy) and chi-squared divergence to the full class of $\Phi$-divergences generated by a twice-differentiable strictly convex $\Phi$, and establishes tightness in the KL case via Gaussian examples, with practical implications for sampling accuracy and algorithm design.
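To make the two limits concrete, here is a minimal Python sketch for a one-dimensional Gaussian target. It is an illustration under stated assumptions, not the paper's experiments: the target $\nu = \mathcal{N}(0, 1/\alpha)$ with potential $f(x) = \alpha x^2/2$, the initial law $\mathcal{N}(3, 4)$, the values of $\alpha$ and $\eta$, and all function names are chosen here for illustration. Because every update is affine with Gaussian noise, the law at each step stays Gaussian, so the sketch tracks only its mean and variance instead of drawing samples.

```python
import math

def ula_step(mean, var, alpha, eta):
    """One ULA step x <- x - eta*f'(x) + sqrt(2*eta)*z for f(x) = alpha*x^2/2,
    applied to the mean and variance of a Gaussian law (illustrative helper)."""
    a = 1.0 - eta * alpha
    return a * mean, a * a * var + 2.0 * eta

def proximal_step(mean, var, alpha, eta):
    """One Proximal Sampler step for the same Gaussian target (illustrative helper).
    Forward:  y | x ~ N(x, eta).
    Backward: x | y ~ N(y / (1 + alpha*eta), eta / (1 + alpha*eta)),
    the conditional proportional to exp(-f(x) - (x - y)^2 / (2*eta))."""
    mean_y, var_y = mean, var + eta
    c = 1.0 / (1.0 + alpha * eta)
    return c * mean_y, c * c * var_y + eta * c

def kl_gauss(mean, var, target_var):
    """KL( N(mean, var) || N(0, target_var) ) in one dimension."""
    r = var / target_var
    return 0.5 * (r + mean * mean / target_var - 1.0 - math.log(r))

def run(step, mean, var, alpha, eta, n_steps):
    """Iterate a moment update and return the list of (mean, var) pairs."""
    traj = [(mean, var)]
    for _ in range(n_steps):
        mean, var = step(mean, var, alpha, eta)
        traj.append((mean, var))
    return traj

# Assumed illustration parameters (not taken from the paper).
alpha, eta = 1.0, 0.1
ula = run(ula_step, 3.0, 4.0, alpha, eta, 400)
prox = run(proximal_step, 3.0, 4.0, alpha, eta, 400)

# Fixed point of the ULA variance recursion: the biased limit for this target.
ula_limit_var = 1.0 / (alpha * (1.0 - alpha * eta / 2.0))

print("target variance          :", 1.0 / alpha)
print("ULA limiting variance    :", ula[-1][1], "(closed form:", ula_limit_var, ")")
print("Proximal limiting variance:", prox[-1][1])

# Compare the Proximal Sampler's KL with the (1 + alpha*eta)^(-2k) envelope.
kl0 = kl_gauss(*prox[0], 1.0 / alpha)
for k in (0, 5, 10, 20):
    bound = kl0 * (1.0 + alpha * eta) ** (-2 * k)
    print(k, kl_gauss(*prox[k], 1.0 / alpha), "<=", bound)
```

In this example ULA settles at a variance different from the target's, which is the bias of $\nu^{\eta}$, while the Proximal Sampler's variance converges to the target variance $1/\alpha$ and its KL divergence stays below the $(1+\alpha\eta)^{-2k}$ envelope.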

Abstract

We study the mixing time of two popular discrete-time Markov chains in continuous space, the Unadjusted Langevin Algorithm and the Proximal Sampler, which are discretizations of the Langevin dynamics. We extend mixing time analyses for these Markov chains to hold in $Φ$-divergence. We show that any $Φ$-divergence arising from a twice-differentiable strictly convex function $Φ$ converges to $0$ exponentially fast along these Markov chains, under the assumption that their stationary distributions satisfy the corresponding $Φ$-Sobolev inequality, which holds for example when the target distribution of the Langevin dynamics is strongly log-concave. Our setting includes as special cases popular mixing time regimes, namely the mixing in chi-squared divergence under a Poincaré inequality, and the mixing in relative entropy under a log-Sobolev inequality. Our results follow by viewing the sampling algorithms as noisy channels and bounding the contraction coefficients arising in the appropriate strong data processing inequalities.
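For reference, the objects named in the abstract can be written out as follows. This is a sketch using the standard definitions; the normalization of the constant $\alpha$ is an assumption here and may differ from the paper's convention. For a convex $\Phi$ with $\Phi(1) = 0$ and probability measures $\mu \ll \nu$:

$$
\mathsf{D}_{\Phi}(\mu \,\|\, \nu) = \mathbb{E}_{\nu}\!\left[\Phi\!\left(\frac{d\mu}{d\nu}\right)\right],
\qquad
\mathsf{FI}_{\Phi}(\mu \,\|\, \nu) = \mathbb{E}_{\nu}\!\left[\Phi''\!\left(\frac{d\mu}{d\nu}\right)\left\|\nabla \frac{d\mu}{d\nu}\right\|^{2}\right],
$$

and $\nu$ satisfies a $\Phi$-Sobolev inequality with constant $\alpha > 0$ if, for all such $\mu$,

$$
2\alpha\, \mathsf{D}_{\Phi}(\mu \,\|\, \nu) \le \mathsf{FI}_{\Phi}(\mu \,\|\, \nu).
$$

Taking $\Phi(x) = x \log x$ gives the relative entropy and the log-Sobolev inequality; taking $\Phi(x) = (x-1)^2$ gives the chi-squared divergence and, after cancelling the factor of 2, the Poincaré inequality $\alpha \operatorname{Var}_{\nu}(h) \le \mathbb{E}_{\nu}\!\left[\|\nabla h\|^{2}\right]$ with $h = d\mu/d\nu$.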
