Bernstein's inequalities for general Markov chains

Bai Jiang, Qiang Sun, Jianqing Fan

TL;DR

This work addresses Bernstein-type concentration for functions of general Markov chains, including non-reversible and general-state-space cases. By combining León-Perron convex majorization with Kato perturbation theory and a discretization strategy, the authors derive moment generating function (MGF) bounds that split into an i.i.d.-like term and a dependence term. The explicit bound is $\mathbb{E}_\pi[e^{t\sum_{i=1}^n f_i(X_i)}] \le e^{n g(t;\lambda,c,\sigma^2)}$ with $g(t;\lambda,c,\sigma^2)=\frac{\sigma^2}{c^2}(e^{tc}-1-tc)+\frac{\sigma^2\lambda t^2}{1-\lambda-5ct}$, and the corresponding tail bounds are expressed through the convex conjugate $g^*(\epsilon;\lambda,c,\sigma^2)$. They extend these results to nonstationary starts and provide two practical applications: tighter non-asymptotic confidence intervals for MCMC integral estimation and a robust mean estimator under Markov dependence, both achieving sharp concentration with optimal variance proxies. The framework unifies Bernstein-type inequalities under dependence, recovers the independent case as a special case, and has direct implications for MCMC analysis and robust statistics with dependent data.

Abstract

We establish Bernstein's inequalities for functions of general (general-state-space and possibly non-reversible) Markov chains. These inequalities achieve sharp variance proxies and encompass the classical Bernstein inequality for independent random variables as a special case. The key analysis lies in bounding the operator norm of a perturbed Markov transition kernel by the exponential of the sum of two convex functions. One coincides with the function that delivers the classical Bernstein inequality, and the other reflects the influence of the Markov dependence. A convex analysis of these two functions then yields our Bernstein inequalities. As applications, we apply our Bernstein inequalities to the Markov chain Monte Carlo integral estimation problem and the robust mean estimation problem with Markov-dependent samples, and achieve tight deviation bounds that previous inequalities cannot.
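
As a quick check of the "special case" claim, the following worked step uses the exponent $g$ quoted in the TL;DR and sets $\lambda = 0$, which is taken here (as an assumption) to represent the independent case. The dependence term then vanishes:

$$g(t;0,c,\sigma^2)=\frac{\sigma^2}{c^2}\left(e^{tc}-1-tc\right).$$

Applying the elementary inequality $e^{x}-1-x \le \frac{x^2}{2(1-x/3)}$ for $0 \le x < 3$ with $x = tc$ gives

$$g(t;0,c,\sigma^2) \le \frac{\sigma^2 t^2}{2(1-ct/3)}, \qquad 0 \le t < 3/c,$$

which is the familiar MGF exponent from which the classical Bernstein tail bound is usually derived. For $\lambda > 0$, the extra term $\frac{\sigma^2\lambda t^2}{1-\lambda-5ct}$ is nonnegative on the admissible range of $t$, so it can only weaken the bound.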

Paper Structure

This paper contains 14 sections, 15 theorems, 122 equations, 3 tables, 1 algorithm.

Key Result

Theorem 1

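Suppose $\{X_i\}_{i \ge 1}$ is a stationary Markov chain with invariant distribution $\pi$ and absolute spectral gap $1-\lambda > 0$, and the $f_i$ are functions with $|f_i| \le c$ and $\pi(f_i) = 0$. Let $\sigma^2 = \sum_{i=1}^n \pi(f_i^2) / n$. Then, for any $0 \le t < (1-\lambda)/(5c)$,

$$\mathbb{E}_\pi\left[e^{t\sum_{i=1}^n f_i(X_i)}\right] \le e^{n g(t;\lambda,c,\sigma^2)},$$

where

$$g(t;\lambda,c,\sigma^2)=\frac{\sigma^2}{c^2}\left(e^{tc}-1-tc\right)+\frac{\sigma^2\lambda t^2}{1-\lambda-5ct}.$$

Moreover, for any $\epsilon > 0$,

$$\mathbb{P}_\pi\left(\frac{1}{n}\sum_{i=1}^n f_i(X_i) > \epsilon\right) \le e^{-n g^*(\epsilon;\lambda,c,\sigma^2)},$$

where $g^*(\epsilon; \lambda, c, \sigma^2)$ is the convex conjugate of $g(t; \lambda, c, \sigma^2)$ ...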
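
To get a feel for the numbers, here is a minimal Python sketch that evaluates the exponent $g$ quoted in the TL;DR and approximates its convex conjugate by grid search. It assumes the tail bound takes the Chernoff form $e^{-n g^*(\epsilon)}$ with the supremum taken over $0 \le t < (1-\lambda)/(5c)$. The function names (`g`, `g_star`, `tail_bound`) and the grid size are illustrative choices, not part of the paper.

```python
import math


def g(t, lam, c, sigma2):
    """Exponent g(t; lambda, c, sigma^2) as quoted in the TL;DR."""
    if not (0 <= t < (1 - lam) / (5 * c)):
        raise ValueError("t must lie in [0, (1 - lam) / (5c))")
    iid_term = sigma2 / c**2 * (math.exp(t * c) - 1 - t * c)
    dependence_term = sigma2 * lam * t**2 / (1 - lam - 5 * c * t)
    return iid_term + dependence_term


def g_star(eps, lam, c, sigma2, grid=10_000):
    """Grid-search approximation of sup_t { t*eps - g(t) } on the admissible range.

    The grid maximum never exceeds the true supremum, so the resulting
    tail bound is valid but may be slightly conservative.
    """
    t_max = (1 - lam) / (5 * c)
    best = 0.0  # the value at t = 0
    for k in range(1, grid):
        t = t_max * k / grid
        best = max(best, t * eps - g(t, lam, c, sigma2))
    return best


def tail_bound(eps, n, lam, c, sigma2):
    """Assumed Chernoff-form bound on P((1/n) * sum f_i(X_i) > eps)."""
    return math.exp(-n * g_star(eps, lam, c, sigma2))


if __name__ == "__main__":
    # Hypothetical parameters: n samples, |f_i| <= 1, variance proxy 1.
    for lam in (0.0, 0.5, 0.9):
        print(lam, tail_bound(eps=0.1, n=1000, lam=lam, c=1.0, sigma2=1.0))
```

Because any admissible $t$ yields a valid Chernoff bound, a coarse grid is safe here; a finer grid or a one-dimensional optimizer only tightens the result.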

Theorems & Definitions (28)

  • Definition 1: Absolute spectral gap
  • Definition 2: Right spectral gap
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Definition 3: León-Perron operator
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • ...and 18 more
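
The absolute spectral gap $1-\lambda$ drives the bound in Theorem 1, so a small numerical illustration may help. The sketch below handles only a finite-state chain and assumes that $\lambda$ is the operator norm of the transition matrix $P$ acting on mean-zero functions in $L_2(\pi)$; this page does not reproduce Definition 1, so that reading is an assumption. Under it, $\lambda$ equals the second-largest singular value of $D^{1/2} P D^{-1/2}$ with $D = \mathrm{diag}(\pi)$. The function names are hypothetical.

```python
import numpy as np


def stationary_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(vals - 1.0))
    pi = np.real(vecs[:, k])
    return pi / pi.sum()


def absolute_spectral_gap(P):
    """Return 1 - lambda for a finite-state chain (assumed definition, see lead-in)."""
    pi = stationary_distribution(P)
    d = np.sqrt(pi)
    # Matrix of P in an orthonormal basis of L2(pi): D^{1/2} P D^{-1/2}.
    M = (d[:, None] * P) / d[None, :]
    # Singular values come back sorted in descending order; the largest is 1
    # (constant functions), the next one is the norm on mean-zero functions.
    s = np.linalg.svd(M, compute_uv=False)
    return 1.0 - s[1]


if __name__ == "__main__":
    # Two-state chain whose eigenvalues are 1 and 0.5.
    P = np.array([[0.7, 0.3],
                  [0.2, 0.8]])
    print(absolute_spectral_gap(P))
```

For reversible chains the matrix `M` is symmetric, so its singular values are the absolute values of the eigenvalues of $P$; the singular-value route is used here because it also covers the non-reversible case.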