Entropy contraction of the Gibbs sampler under log-concavity

Filippo Ascolani; Hugo Lavenant; Giacomo Zanella

TL;DR

This work gives explicit, dimension-free convergence guarantees for the random-scan Gibbs sampler targeting strongly log-concave distributions on product spaces. Using a novel approximate tensorization of entropy via Knothe–Rosenblatt triangular transport, the authors obtain a sharp contraction bound in KL divergence with rate $1/(\kappa^* M)$, where $M$ is the number of coordinate blocks and $\kappa^*$ is a coordinate-wise condition number. The number of iterations needed to mix therefore scales linearly in $M$ and $\kappa^*$, with no further dependence on the ambient dimension. The analysis extends to Metropolis-within-Gibbs and Hit-and-Run, and is accompanied by a comparison with gradient-based MCMC schemes and with optimization. In the non-strongly log-concave regime, convergence slows from exponential to polynomial, but warm-start strategies recover practical guarantees. Overall, the results clarify when coordinate-wise sampling achieves dimension-free efficiency and connect sampling convergence to coordinate-wise convexity and transport inequalities.
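As a hedged illustration of how such a rate turns into a mixing-time bound, suppose the one-step bound takes the standard contraction form below. Here $P$ denotes the Gibbs sampler kernel, $\mu$ a starting distribution with finite $\mathrm{KL}(\mu \,|\, \pi)$, and $\epsilon$ a target accuracy; this notation is introduced only for the sketch and is not taken from the summary above.

$$
\mathrm{KL}(\mu P \,|\, \pi) \le \Big(1 - \frac{1}{\kappa^* M}\Big)\,\mathrm{KL}(\mu \,|\, \pi)
\quad\Longrightarrow\quad
\mathrm{KL}(\mu P^n \,|\, \pi) \le \Big(1 - \frac{1}{\kappa^* M}\Big)^{n}\,\mathrm{KL}(\mu \,|\, \pi) \le e^{-n/(\kappa^* M)}\,\mathrm{KL}(\mu \,|\, \pi).
$$

Under that assumption, $\mathrm{KL}(\mu P^n \,|\, \pi) \le \epsilon$ holds as soon as $n \ge \kappa^* M \log\big(\mathrm{KL}(\mu \,|\, \pi)/\epsilon\big)$. If one block update costs roughly $1/M$ of a full evaluation of $\pi$, this corresponds to about $\kappa^* \log\big(\mathrm{KL}(\mu \,|\, \pi)/\epsilon\big)$ full evaluations, which is linear in the condition number and free of $M$.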

Abstract

The Gibbs sampler (a.k.a. Glauber dynamics and heat-bath algorithm) is a popular Markov Chain Monte Carlo algorithm which iteratively samples from the conditional distributions of a probability measure $\pi$ of interest. Under the assumption that $\pi$ is strongly log-concave, we show that the random scan Gibbs sampler contracts in relative entropy and provide a sharp characterization of the associated contraction rate. Assuming that evaluating conditionals is cheap compared to evaluating the joint density, our results imply that the number of full evaluations of $\pi$ needed for the Gibbs sampler to mix grows linearly with the condition number and is independent of the dimension. If $\pi$ is non-strongly log-concave, the convergence rate in entropy degrades from exponential to polynomial. Our techniques are versatile and extend to Metropolis-within-Gibbs schemes and the Hit-and-Run algorithm. A comparison with gradient-based schemes and the connection with the optimization literature are also discussed.
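To make the algorithm concrete, here is a minimal sketch of a random-scan Gibbs sampler. It assumes a zero-mean Gaussian target with precision matrix `Q` and one scalar coordinate per block, chosen only because the conditionals are then available in closed form; the function name `random_scan_gibbs_gaussian` and its parameters are hypothetical and do not come from the paper.

```python
import numpy as np


def random_scan_gibbs_gaussian(Q, n_steps, x0=None, seed=0):
    """Random-scan Gibbs sampler for the Gaussian N(0, Q^{-1}).

    Q is a symmetric positive-definite precision matrix (an assumption of
    this sketch). Each step picks one coordinate uniformly at random and
    redraws it from its exact conditional given the other coordinates.
    """
    rng = np.random.default_rng(seed)
    Q = np.asarray(Q, dtype=float)
    M = Q.shape[0]
    x = np.zeros(M) if x0 is None else np.array(x0, dtype=float)
    samples = np.empty((n_steps, M))

    for t in range(n_steps):
        m = rng.integers(M)  # uniformly chosen coordinate to update
        # For N(0, Q^{-1}): x_m | x_{-m} is Gaussian with
        #   mean = -(sum_{j != m} Q[m, j] * x[j]) / Q[m, m]
        #   variance = 1 / Q[m, m]
        off_diag = Q[m] @ x - Q[m, m] * x[m]
        cond_mean = -off_diag / Q[m, m]
        cond_std = 1.0 / np.sqrt(Q[m, m])
        x[m] = cond_mean + cond_std * rng.standard_normal()
        samples[t] = x  # state after this single-coordinate update

    return samples


if __name__ == "__main__":
    Q = np.array([[2.0, 1.0], [1.0, 2.0]])
    draws = random_scan_gibbs_gaussian(Q, n_steps=50_000, seed=1)
    # Compare the empirical covariance with the target covariance Q^{-1}.
    print(np.cov(draws[1_000:].T))
    print(np.linalg.inv(Q))
```

Each iteration touches a single row of `Q`, which mirrors the assumption in the abstract that a conditional update is cheap compared with evaluating the joint density.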

Paper Structure (25 sections, 29 theorems, 109 equations, 1 table, 2 algorithms)

Introduction
Related works
Notation and assumptions
The Gibbs sampler Markov kernel
Assumptions on the potential
Main result: entropy contraction of the Gibbs sampler
Implications for mixing time
Related functional inequalities
Tightness of the entropy contraction rate
Computational considerations
Implementation of GS
Comparison with gradient-based MCMC schemes
Comparison with optimization: similarities and differences
Proof of the main result
Proof strategy
...and 10 more sections

Key Result

Lemma 2.1

For any $m=1,\ldots, M$ the following holds.

Theorems & Definitions (66)

Remark 1.1
Lemma 2.1
Lemma 2.2: Variational characterization
Remark 2.3: Invariance of GS under coordinate-wise transformations
Lemma 2.4
proof
Remark 2.5: Checking the assumptions for $C^2$ potentials
Remark 2.6: Separable potential with a quadratic interaction term
Theorem 3.1
Theorem 3.2
...and 56 more
