A coupling-based approach to f-divergences diagnostics for Markov chain Monte Carlo

Adrien Corenflos, Hai-Dang Dau

TL;DR

The paper tackles the gap between theoretical MCMC convergence metrics and practical diagnostics by introducing a coupling-based weight-harmonization framework that yields online, consistent importance weights for multiple chains, enabling upper bounds on any $f$-divergence $D_f(\pi\|\mu_t)$ (e.g., $\operatorname{KL}$, $\chi^2$, Hellinger, TV). It proves consistency as the number of chains grows and, under strong coupling, exponential convergence of the weights toward uniform, yielding divergence bounds that decrease over time and a practical diagnostic tool. Numerical experiments across Gaussian, Pólya–Gamma Gibbs, and MALA-based stochastic volatility models show performance competitive with existing diagnostics, while highlighting conservativeness in some regimes and the benefit of online applicability without lag or warm-up. The work points to enhancements via Rao–Blackwellization, offline smoothing, and variance-reduction techniques to further tighten bounds and broaden applicability.

Abstract

A long-standing gap exists between the theoretical analysis of Markov chain Monte Carlo convergence, which is often based on statistical divergences, and the diagnostics used in practice. We introduce the first general convergence diagnostics for Markov chain Monte Carlo based on any $f$-divergence, allowing users to directly monitor, among others, the Kullback–Leibler and the $\chi^2$ divergences as well as the Hellinger and the total variation distances. Our first key contribution is a coupling-based 'weight harmonization' scheme that produces a direct, computable, and consistent weighting of interacting Markov chains with respect to their target distribution. The second key contribution is to show how such consistent weightings of empirical measures can be used to provide upper bounds on $f$-divergences in general. We prove that these bounds are guaranteed to tighten over time and converge to zero as the chains approach stationarity, providing a concrete diagnostic. Numerical experiments demonstrate that our method is a practical and competitive diagnostic tool.

Paper Structure

This paper contains 45 sections, 16 theorems, 90 equations, 8 figures, 6 algorithms.

Key Result

Lemma 1

Let $W^1, \ldots, W^N$ be $N$ non-negative real numbers such that $\sum_{n=1}^N W^n = 1$. Then, for any $N$ elements $X^1, \ldots, X^N$ in an arbitrary space $\mathcal{X}$, In particular, for the chi-squared distance where $f(t) = (t-1)^2$,
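The displayed formulas of Lemma 1 are not reproduced in this listing. As a hedged illustration only, the sketch below assumes the lemma compares the weighted empirical measure $\sum_n W^n \delta_{X^n}$ with the uniform one $\frac{1}{N}\sum_n \delta_{X^n}$, in which case Jensen's inequality gives the weight-only bound $\frac{1}{N}\sum_{n=1}^N f(N W^n)$, which reduces to $N\sum_{n=1}^N (W^n)^2 - 1$ for $f(t) = (t-1)^2$. The function names `weight_bound` and `exact_divergence` are hypothetical and do not come from the paper.

```python
import math
from collections import defaultdict

# Generators f of a few f-divergences (all convex, with f(1) = 0).
def chi2(t):
    return (t - 1.0) ** 2

def tv(t):
    return 0.5 * abs(t - 1.0)

def kl(t):
    # t * log(t), extended by continuity at t = 0
    return t * math.log(t) if t > 0 else 0.0

def weight_bound(weights, f):
    """Weight-only quantity (1/N) * sum_n f(N * W^n).

    Assumption: this is the form of the bound in Lemma 1; it depends
    on the weights alone, not on the particle locations.
    """
    n = len(weights)
    return sum(f(n * w) for w in weights) / n

def exact_divergence(points, weights, f):
    """D_f between the weighted and uniform empirical measures.

    Coinciding points have their masses merged before the density
    ratio is taken, so the result can be smaller than weight_bound.
    """
    n = len(points)
    mass_w = defaultdict(float)  # mass under the weighted measure
    mass_u = defaultdict(float)  # mass under the uniform measure
    for x, w in zip(points, weights):
        mass_w[x] += w
        mass_u[x] += 1.0 / n
    return sum(mass_u[x] * f(mass_w[x] / mass_u[x]) for x in mass_u)

if __name__ == "__main__":
    w = [0.5, 0.25, 0.25]
    for name, f in [("chi2", chi2), ("TV", tv), ("KL", kl)]:
        print(name, weight_bound(w, f))
    # Two coinciding particles: the exact divergence drops below the bound.
    print(exact_divergence(["a", "a", "b"], w, chi2), weight_bound(w, chi2))
```

With distinct particles the two functions agree; when particles coincide (as after a successful coupling) the weight-only quantity is an upper bound, and uniform weights give zero for every choice of $f$.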

Figures (8)

  • Figure 1: Step of Algorithm \ref{alg:weight-harmonization} for $2N = 4$ particles and successful couplings: $X_{t+1}^1 = X_{t+1}^3$ and $X_{t+1}^2 = X_{t+1}^4$.
  • Figure 2: Comparison of the theoretical \ref{eq:ess-theo} (solid, black) and empirical \ref{eq:ess-is} (gray; different dash patterns correspond to different numbers of particles) effective sample size measured for $\mu_t$.
  • Figure 3: Total variation comparison (left) and ESS profile (right) for the Pólya–Gamma Gibbs sampler applied to logistic regression on the Credit dataset. Dashed curves correspond to our bound, while the solid line corresponds to Biswas et al. (2019).
  • Figure 4: Left: total variation upper bounds of Biswas et al. (2019) (solid) and of our harmonization procedure (dashed). Right: harmonized upper bounds for the Hellinger distance (solid), the $\operatorname{KL}$ divergence (dashed) and the $\chi^2$ distance (dotted).
  • Figure 5: Total variation monitoring of the chains; the organization is the same as in Figure \ref{fig:ess-gaussian}, except that no theoretical line is shown.
  • ...and 3 more figures

Theorems & Definitions (37)

  • Definition 1
  • Example 1
  • Lemma 1
  • Theorem 1
  • Remark 1
  • Remark 2
  • Proposition 1
  • Remark 3
  • Proposition 2: Invariance of expectations under Algorithm \ref{alg:weight-harmonization}
  • Theorem 2: Consistency
  • ...and 27 more