A mixing time bound for Gibbs sampling from log-smooth log-concave distributions

Neha S. Wadia

TL;DR

The paper addresses the nonasymptotic mixing time of Gibbs sampling for target distributions with density $\pi(x) \propto e^{-f(x)}$, where $f$ is $\mu$-strongly convex and $L$-smooth. It develops a conductance-based framework inside a high-probability ball centered at the mode, together with an approximation theory on small cubes, to prove a polynomial mixing time bound: $\tau(\gamma)$ scales as a universal constant times $\kappa^2 n^{7.5}$, augmented by logarithmic factors in $n$, $M$, and $\gamma$. The main contributions are a concrete $s$-conductance bound $\phi_s>\Psi/(40n)$ and a cube-based isoperimetric inequality that together yield a rigorous $O^{\star}(\kappa^2 n^{7.5} (\max\{1,\sqrt{(1/n)\log (2M/\gamma)}\})^2)\log(2M/\gamma)$-type mixing-time guarantee for warm-start Gibbs sampling on this class of distributions. This advances understanding of high-dimensional sampling by extending fast-mixing results previously known for uniform or less strongly log-concave targets to the more general, strongly log-concave, log-smooth setting, with implications for practical MCMC sampling in multivariate contexts.

Abstract

The Gibbs sampler, also known as the coordinate hit-and-run algorithm, is a Markov chain that is widely used to draw samples from probability distributions in arbitrary dimensions. At each iteration of the algorithm, a randomly selected coordinate is resampled from the distribution that results from conditioning on all the other coordinates. We study the behavior of the Gibbs sampler on the class of log-smooth and strongly log-concave target distributions supported on $\mathbb{R}^n$. Assuming the initial distribution is $M$-warm with respect to the target, we show that the Gibbs sampler requires at most $O^{\star}\left(\kappa^2 n^{7.5}\left(\max\left\{1,\sqrt{\frac{1}{n}\log \frac{2M}{\gamma}}\right\}\right)^2\right)$ steps to produce a sample with error no more than $\gamma$ in total variation distance from a distribution with condition number $\kappa$.
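The update rule described in the abstract can be illustrated with a minimal sketch. It assumes a zero-mean Gaussian target with $f(x) = \tfrac{1}{2}x^{\top}Ax$, chosen here only because its coordinate conditionals are available in closed form; the paper's class of targets is more general. For this choice $\mu$ and $L$ are the smallest and largest eigenvalues of $A$. The names `gibbs_gaussian` and `condition_number` are hypothetical and do not come from the paper.

```python
import numpy as np


def condition_number(precision):
    """kappa = L / mu for f(x) = 0.5 * x^T A x, where A is the precision matrix."""
    eigenvalues = np.linalg.eigvalsh(precision)  # sorted in ascending order
    return eigenvalues[-1] / eigenvalues[0]


def gibbs_gaussian(precision, x0, n_steps, rng):
    """Random-scan Gibbs sampler for the density proportional to exp(-0.5 x^T A x).

    Assumption: Gaussian target, so that x_i given the other coordinates is
    normal with mean -(sum_{j != i} A_ij x_j) / A_ii and variance 1 / A_ii.
    Returns an array of shape (n_steps, n) holding every iterate.
    """
    x = np.array(x0, dtype=float)
    n = x.size
    trajectory = np.empty((n_steps, n))
    for k in range(n_steps):
        i = rng.integers(n)  # pick a coordinate uniformly at random
        # Contribution of all coordinates except i to the conditional mean.
        off_diagonal = precision[i] @ x - precision[i, i] * x[i]
        mean = -off_diagonal / precision[i, i]
        std = 1.0 / np.sqrt(precision[i, i])
        x[i] = rng.normal(mean, std)  # resample coordinate i from its conditional
        trajectory[k] = x
    return trajectory


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = np.array([[2.0, 1.0], [1.0, 2.0]])
    samples = gibbs_gaussian(A, np.zeros(2), 5000, rng)
    print("condition number:", condition_number(A))
    print("empirical covariance:\n", np.cov(samples[500:].T))
    print("target covariance:\n", np.linalg.inv(A))
```

Comparing the empirical covariance of the iterates with $A^{-1}$ gives an informal check that the chain is targeting the intended distribution. It says nothing about the mixing-time bound itself, which concerns total variation distance from an $M$-warm start.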

Paper Structure

This paper contains 17 sections, 8 theorems, 105 equations, 4 figures, 1 algorithm.

Key Result

Theorem 1.1

Consider a probability distribution supported on $\mathbb{R}^n$ with density function proportional to $e^{-f(x)}$ where $f$ is $\mu$-strongly convex in $x$ and has $L$-Lipschitz gradients. Let $\kappa=L/\mu$. Let $\pi^{k}$ be the distribution of the $k^{th}$ iterate produced by the Gibbs sampler. Fo

Figures (4)

  • Figure 1: Illustrated here is a partition $A_1\cup A_2$ of $\mathbb{R}^2$ such that for some $s\in(0,1/2)$, $s<\Pi(A_1)\leq 1/2$. The Euclidean ball $K$ is centered at the mode $x^{\star}$ of $\pi$ and is large enough so that its measure differs from unity by a fraction of $s$. $A_1^{\prime}$ (in light pink) and $A_2^{\prime}$ (in light blue) are subsets of $A_1$ and $A_2$, respectively. $A_1^{\prime}$ and $A_2^{\prime}$ are axis disjoint. $S_1\cup S_2 \cup S_3$ is a partition of $K$ such that $S_i=K\cap A_i^{\prime}$ for $i=1,2$. $\Pi(S_1)\leq \Pi(S_2)$.
  • Figure 2: Pictured here is a cube of side $\delta$. $\beta$, shaded in light yellow, is a facet normal to the $e_3$ coordinate axis. $\beta_a$, shaded in gray, is the set that results from translating $\beta$ along the $e_3$ axis to $x_3=a$. $\omega$ is the subset of $\beta$ shaded in a darker yellow, and $\omega_a$ is the subset of $\beta_a$ shaded in a darker gray. Outlined in black is the extension $B$ of $\omega$ along $e_3$ in the cube.
  • Figure 3: The Euclidean ball $K$ is embedded in a grid of cubes of side $\delta$. Shaded in purple is the set of cubes in the bulk set $C_2$ covering $S_1$. In orange is the boundary set $C_1$.
  • Figure 4: The part of $\beta$ underneath the dashed line is $\mathcal{P}_{\beta}(S_1\cap v)$, the projection of $S_1\cap v$ onto $\beta$. The part of $u$ underneath the dashed line is $\mathcal{E}_u(\mathcal{P}_{\beta})$, the extension of $\mathcal{P}_{\beta}(S_1\cap v)$ in $u$ along the coordinate direction normal to $\beta$. The subset of this region shaded with lines is precisely the part of $S_3\cap u$ that the dynamics can escape to through $\beta$ from $S_1\cap v$. A worst-case lower bound on its measure is derived in \ref{eq:inter-bound-on-pi-S3-in-u}.

Theorems & Definitions (15)

  • Theorem 1.1
  • Definition 1: Axis-disjoint sets
  • Lemma 1.2
  • Theorem 2.1
  • Lemma 3.1
  • Lemma 3.2
  • proof
  • proof : Proof of Theorem \ref{thm:main-mixing-time}
  • Definition 2: Axis-aligned cubes.
  • proof
  • ...and 5 more