Extended Laplace Principle for Empirical Measures of a Markov Chain

Stephan Eckstein

TL;DR

This work extends the Laplace principle for empirical measures of Markov chains on Polish spaces to a broad class of convex dual pairs via a β–ρ duality framework, building on the weak convergence approach of Dupuis–Ellis. The main result provides matched upper and lower large deviations bounds for empirical measures under general assumptions, recovering the classical i.i.d. setting when β reduces to the relative entropy. A primary application develops a robust Markov-chain theory, where transition uncertainty is modeled by Wasserstein neighborhoods, yielding robust large deviations and robust weak laws of large numbers with explicit rate functions $I$ and $\underline{I}$. The approach highlights how convex duality and measurable selection translate distributional uncertainty into tractable variational characterizations, enabling worst-case analysis in Markovian settings. Overall, the paper bridges convex-analytic methods and stochastic-perturbation analysis to quantify robustness in large deviations and limit theorems for Markov chains.

Abstract

We consider discrete-time Markov chains with Polish state space. The large deviations principle for empirical measures of a Markov chain can equivalently be stated in Laplace principle form, which builds on the convex dual pair of relative entropy (or Kullback-Leibler divergence) and cumulant generating functional $f\mapsto \ln \int \exp(f)$. Following the approach by Lacker in the i.i.d. case, we generalize the Laplace principle to a larger class of convex dual pairs. We present in depth one application arising from this extension, which includes large deviations results and a weak law of large numbers for certain robust Markov chains, similar to Markov set chains, where we model robustness via the first Wasserstein distance. The setting and proof of the extended Laplace principle are based on the weak convergence approach to large deviations by Dupuis and Ellis.
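For orientation, the classical dual pair mentioned in the abstract can be written out as follows. This is a sketch of the standard variational (Donsker-Varadhan) formulas, not a statement taken from the paper; the symbol $R(\nu\,\|\,\mu)$ for relative entropy and the choice of bounded measurable $f$ are notational assumptions made here.

```latex
% Assumed notation: \mu, \nu \in \mathcal{P}(E), f bounded and measurable on E,
% R(\nu \,\|\, \mu) the relative entropy of \nu with respect to \mu.
\[
  \ln \int_E e^{f}\, d\mu
  \;=\; \sup_{\nu \in \mathcal{P}(E)}
        \Big( \int_E f \, d\nu \;-\; R(\nu \,\|\, \mu) \Big),
\]
\[
  R(\nu \,\|\, \mu)
  \;=\; \sup_{f \in C_b(E)}
        \Big( \int_E f \, d\nu \;-\; \ln \int_E e^{f}\, d\mu \Big).
\]
```

The extension described in the abstract replaces this particular pair by a more general convex dual pair; relative entropy is then the special case that recovers the classical Laplace principle.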

Paper Structure

This paper contains 16 sections, 20 theorems, 150 equations, 1 figure.

Key Result

Theorem 1.1

Let $I: \mathcal{P}(E) \rightarrow (-\infty,\infty]$ denote the rate function. Under conditions (B.1), (B.2) and (T), the upper bound holds for all upper semi-continuous and bounded functions $F : \mathcal{P}(E) \rightarrow \mathbb{R}$. Under conditions (M.1), (M.2), (B.1) and (B.3), the lower bound holds for all $F \in C_b(\mathcal{P}(E))$.

Figures (1)

  • Figure 1: Illustration of convergence rates, simulated (100 paths) realized convergence and the stationary distributions under the normal Markov chain and the robust worst-case Markov chain.

Theorems & Definitions (37)

  • Theorem 1.1
  • Corollary 1.2
  • Theorem 1.3
  • Theorem 1.4
  • Lemma 2.1
  • proof
  • Lemma 2.2
  • proof
  • Lemma 2.3
  • proof
  • ...and 27 more