Theoretical guarantees for neural control variates in MCMC

Denis Belomestny; Artur Goldman; Alexey Naumov; Sergey Samsonov

TL;DR

The paper proposes a variance reduction approach for Markov chains that uses additive control variates, chosen by minimizing an appropriate estimate of the asymptotic variance.

Abstract

In this paper, we propose a variance reduction approach for Markov chains based on additive control variates and the minimization of an appropriate estimate for the asymptotic variance. We focus on the particular case when control variates are represented as deep neural networks. We derive the optimal convergence rate of the asymptotic variance under various ergodicity assumptions on the underlying Markov chain. The proposed approach relies upon recent results on the stochastic errors of variance reduction algorithms and function approximation theory.
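The idea of choosing a control variate by minimizing an estimate of the asymptotic variance can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not the authors' setup: the chain is a Gaussian AR(1) process with stationary law N(0, 1) standing in for an MCMC sampler, the control variate is a one-parameter Stein-type function instead of a deep neural network, the lag window uses Bartlett weights, and the window width follows the rule $b_n = 2(\log(1/\rho))^{-1}\log n$ quoted under Key Result, with $\rho$ taken equal to the AR(1) coefficient.

```python
import numpy as np


def lag_cross_cov(u, v, s):
    """Empirical lag-s cross-covariance of two already-centred sequences."""
    n = len(u)
    return np.dot(u[: n - s], v[s:]) / n


def spectral_cov(u, v, b):
    """Lag-window estimate of the asymptotic (cross-)covariance.

    Bartlett weights w(s) = 1 - s/b are an assumption of this sketch.
    """
    u = u - u.mean()
    v = v - v.mean()
    total = lag_cross_cov(u, v, 0)
    for s in range(1, b):
        w = 1.0 - s / b
        total += w * (lag_cross_cov(u, v, s) + lag_cross_cov(v, u, s))
    return total


def simulate_ar1(n, a, rng):
    """AR(1) chain x_{t+1} = a x_t + sqrt(1 - a^2) eps_t, stationary law N(0, 1)."""
    x = np.empty(n)
    x[0] = rng.standard_normal()
    noise = np.sqrt(1.0 - a**2) * rng.standard_normal(n)
    for t in range(1, n):
        x[t] = a * x[t - 1] + noise[t]
    return x


rng = np.random.default_rng(0)
n, a = 20_000, 0.9
x = simulate_ar1(n, a, rng)

# Window width from the b_n rule, with rho = a (an assumption of this sketch).
b_n = int(np.ceil(2.0 * np.log(n) / np.log(1.0 / a)))

f = x**2 + x        # function whose stationary mean we want to estimate
g = 1.0 - x**2      # Stein-type control variate phi'(x) - x*phi(x) with phi(x) = x

# The estimated variance of f - theta*g is quadratic in theta, so the
# minimizer is available in closed form.
theta = spectral_cov(f, g, b_n) / spectral_cov(g, g, b_n)

var_plain = spectral_cov(f, f, b_n)
var_cv = spectral_cov(f - theta * g, f - theta * g, b_n)
print(f"theta = {theta:.3f}, variance without CV = {var_plain:.2f}, with CV = {var_cv:.2f}")
```

With a neural network in place of the single parameter `theta`, the closed-form step would be replaced by gradient-based minimization of the same estimated variance.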

Paper Structure

This paper contains 21 sections, 19 theorems, 166 equations, 3 figures, 11 tables, 2 algorithms.

Introduction
Notations and definitions
ESVM algorithm and control variates
Assumptions
Markov Chain assumptions
VR rates with ESVM procedure
Numerical study
Implementation specifics
Funnel distribution
Banana-shaped distribution
Logistic regression
Conclusion
Acknowledgements
Proofs
Proof of Theorem 1
...and 6 more sections

Key Result

Theorem 1

Assume (AUF), (GE), and (BR). Then for any $x_0 \in \mathsf{S}$ and $\delta \in (0,1)$ there exists $n_0 = n_0(\delta, R_0, d, \beta, \mathcal{D}) > 0$ such that for all $n \geqslant n_0$, $n \in \mathbb{N}$, by setting $R = \log n$, $K = n^{\frac{1}{2\beta + d}}$, $b_n = 2(\log(1/\rho))^{-1}\log n$, where $C_{1,1} = C_{1,1}(\beta, \mathcal{D})$ is independent of the pro...
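To get a feel for the scale of the parameter choices in Theorem 1, the three formulas can be evaluated directly. This is a small sketch only: the function name and the example values of $n$, $\beta$, $d$ and $\rho$ are hypothetical, and the roles of $R$ and $K$ are not spelled out in the truncated statement above, so the code simply evaluates the expressions as written.

```python
import math


def theorem1_parameters(n, beta, d, rho):
    """Evaluate R = log n, K = n^(1/(2*beta + d)), b_n = 2 log(n) / log(1/rho)."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie in (0, 1)")
    R = math.log(n)
    K = n ** (1.0 / (2.0 * beta + d))
    b_n = 2.0 * math.log(n) / math.log(1.0 / rho)
    return R, K, b_n


# Hypothetical example values, chosen only for illustration.
R, K, b_n = theorem1_parameters(n=10_000, beta=2.0, d=1, rho=0.5)
print(f"R = {R:.4f}, K = {K:.4f}, b_n = {b_n:.4f}")
```

All three quantities grow slowly with $n$: $R$ and $b_n$ are logarithmic, and $K$ is polynomial with an exponent that shrinks as the dimension $d$ increases.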

Figures (3)

Figure 1: Funnel distribution
Figure 2: Banana-shaped
Figure 3: Logistic regression, Pima dataset

Theorems & Definitions (34)

Theorem 1
Lemma 2
proof
Lemma 3
proof
Lemma 4 [belomestny_variance_2020_esvm]
Lemma 5
proof
Lemma 6
proof
...and 24 more
