Rates of Convergence in the Central Limit Theorem for Markov Chains, with an Application to TD Learning

R. Srikant

TL;DR

This work establishes non-asymptotic central limit theorems for vector-valued martingales and for functions of Markov chains by combining Lindeberg-style decompositions, Stein's method, and Poisson's equation. It provides explicit rates in Wasserstein distance that quantify how fast the distributions of normalized sums approach their Gaussian limits, and it extends these results to Markov chains with both finite and general state spaces. The paper then applies the Markov-chain CLT to Temporal Difference (TD) learning with Polyak-Ruppert averaging, deriving a concrete rate for the distributional convergence of the averaged TD iterates under decaying step sizes. The results give finite-time normal approximation guarantees for TD learning, and potentially for other stochastic approximation schemes with Markovian noise, connecting asymptotic variance characterizations to finite-sample performance.

Abstract

We prove a non-asymptotic central limit theorem for vector-valued martingale differences using Stein's method, and use Poisson's equation to extend the result to functions of Markov chains. We then show that these results can be applied to establish a non-asymptotic central limit theorem for Temporal Difference (TD) learning with averaging.
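To make the object of the TD result concrete, the following is a minimal sketch of tabular TD(0) with Polyak-Ruppert averaging under a decaying step size. It is an illustration only, not the paper's setting or analysis: the two-state chain `P`, the rewards `r`, the discount `gamma`, the step-size exponent `alpha`, and the function name `averaged_td` are all assumptions chosen for the example.

```python
import random


def averaged_td(n_steps, seed=0, alpha=0.75, gamma=0.5):
    """Tabular TD(0) with Polyak-Ruppert averaging on a toy 2-state chain."""
    rng = random.Random(seed)

    # Hypothetical toy problem: transition matrix P and per-state rewards r.
    P = [[0.9, 0.1], [0.2, 0.8]]
    r = [1.0, 0.0]

    # Exact value function V = (I - gamma * P)^{-1} r via the 2x2 inverse.
    a, b = 1 - gamma * P[0][0], -gamma * P[0][1]
    c, d = -gamma * P[1][0], 1 - gamma * P[1][1]
    det = a * d - b * c
    v_true = [(d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det]

    theta = [0.0, 0.0]  # current TD iterate
    avg = [0.0, 0.0]    # running Polyak-Ruppert average of the iterates
    s = 0
    for k in range(n_steps):
        # Sample the next state of the Markov chain.
        s_next = 0 if rng.random() < P[s][0] else 1
        # Decaying step size 1 / (k + 1)^alpha (assumed schedule).
        eps = 1.0 / (k + 1) ** alpha
        # TD(0) update for the visited state only.
        td_error = r[s] + gamma * theta[s_next] - theta[s]
        theta[s] += eps * td_error
        # Incremental mean of all iterates seen so far.
        avg = [m + (t - m) / (k + 1) for m, t in zip(avg, theta)]
        s = s_next
    return avg, v_true
```

Calling `averaged_td(200_000)` returns the averaged estimate together with the exact values for comparison. The CLT discussed here concerns the distribution of the suitably normalized error of such an averaged iterate, which this sketch does not attempt to measure.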

Paper Structure (21 sections, 6 theorems, 77 equations)


Introduction
Martingale Central Limit Theorem
Preliminaries
Rate of Convergence
Markov Chain Central Limit Theorem
Finite State Space Markov Chains
Extension to General State-Space Markov Chains
Proof of (1):
Proof of (2):
Proof of (3):
Proof of (4):
An Application to TD Learning
Step 1: Equation (\ref{eq1}).
Step 2: Expression (\ref{eq2}).
Step 3: Expression (\ref{step3}).
...and 6 more sections

Key Result

Lemma 1

Let $g\in Lip_L$ and $f$ be the solution to Stein's equation: where $\Delta$ and $\nabla$ are the Laplacian and gradient operators, respectively, and $Z\sim N(0,I)$. Then, $f$ has the following regularity properties for any $\beta\in (0,1)$: $\diamond$
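The displayed Stein equation did not survive in the statement above. For orientation only, the Stein equation for the standard Gaussian on $\mathbb{R}^d$ is commonly written in the form below; the exact form and sign convention used in the paper are an assumption here, and the regularity bounds of the lemma are not reproduced.

```latex
% Commonly used multivariate Stein equation (assumed form, not copied from the paper):
% g is the Lipschitz test function, f the unknown solution, Z ~ N(0, I).
\Delta f(x) - x^{\top} \nabla f(x) = g(x) - \mathbb{E}\left[ g(Z) \right],
\qquad x \in \mathbb{R}^d .
```

Under this form, evaluating both sides at a random vector $X$ and taking expectations gives $\mathbb{E}[g(X)] - \mathbb{E}[g(Z)] = \mathbb{E}[\Delta f(X) - X^{\top}\nabla f(X)]$, which is why regularity of $f$ translates into bounds on the distance between the laws of $X$ and $Z$.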

Theorems & Definitions (10)

Lemma 1
Lemma 2
Theorem 1
proof
Theorem 2
proof
Theorem 3
proof
Theorem 4
proof
