Robust estimation of a Markov chain transition matrix from multiple sample paths
Lasse Leskelä, Maximilien Dreveton
TL;DR
This paper addresses the problem of estimating a Markov chain transition matrix $P$ and stationary distribution $\pi$ from $M$ independent sample paths of length $T$ each, allowing heterogeneous per-path dynamics $P_m$, initial laws $\mu_m$, and stationary laws $\pi_m$ with pseudo-spectral gaps $\gamma_m$. The authors develop a general framework based on ensemble-time averages and derive sharp nonasymptotic Bernstein-type concentration bounds for the empirical transition matrix $\hat{P}$ and stationary distribution $\hat{\pi}$, with explicit dependence on heterogeneity measures $\Delta_1$, $\Delta_\infty$, and $\eta$, and an effective mixing time $T'$; they also address robust estimation under fully corrupted rows and provide large-scale consistency results. The main contributions are (i) extending concentration results from single-path Markov chains to ensembles of heterogeneous chains, (ii) giving robust bounds that tolerate a fraction of corrupted trajectories, and (iii) establishing consistency in high-dimensional regimes where the number of states, the number of chains, and the path lengths may all grow. The results yield practical guidance for inference in heterogeneous Markov models encountered in cohort studies, distributed systems, and temporal networks, and they are complemented by numerical experiments that illustrate the trade-offs between the number of chains $M$ and the trajectory length $T$ under various noise and mixing conditions.
Abstract
Markov chains are fundamental models for stochastic dynamics, with applications in a wide range of areas such as population dynamics, queueing systems, reinforcement learning, and Monte Carlo methods. Estimating the transition matrix and stationary distribution from observed sample paths is a core statistical challenge, particularly when multiple independent trajectories are available. While classical theory typically assumes identical chains with known stationary distributions, real-world data often arise from heterogeneous chains whose transition kernels and stationary measures might differ from a common target. We analyse empirical estimators for such parallel Markov processes and establish sharp concentration inequalities that generalise Bernstein-type bounds from standard time averages to ensemble-time averages. Our results provide nonasymptotic error bounds and consistency guarantees in high-dimensional regimes, accommodating sparse or weakly mixing chains, model mismatch, nonstationary initialisations, and partially corrupted data. These findings offer rigorous foundations for statistical inference in heterogeneous Markov chain settings common in modern computational applications.
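To make the notion of an ensemble-time average concrete, the following minimal sketch pools transition counts over all paths and all time steps, then normalises them into a row-stochastic estimate of the transition matrix and an occupation-frequency estimate of the stationary distribution. It is an illustration under assumptions rather than the authors' exact definition: the function name `estimate_from_paths`, the convention of leaving rows of unvisited states at zero, and the toy paths are all hypothetical.

```python
from collections import Counter


def estimate_from_paths(paths, n_states):
    """Pooled (ensemble-time) empirical estimates from several sample paths.

    paths: list of lists of integer states in range(n_states).
    Returns (P_hat, pi_hat) as a nested list and a list.
    """
    pair_counts = Counter()   # N(x, y): transitions x -> y, pooled over paths
    visit_counts = Counter()  # N(x): visits to x at times that have a successor
    for path in paths:
        for x, y in zip(path[:-1], path[1:]):
            pair_counts[(x, y)] += 1
            visit_counts[x] += 1

    total = sum(visit_counts.values())
    # Assumed convention: rows of states never visited are left as all zeros.
    P_hat = [
        [pair_counts[(x, y)] / visit_counts[x] if visit_counts[x] else 0.0
         for y in range(n_states)]
        for x in range(n_states)
    ]
    # Fraction of (non-terminal) time steps spent in each state.
    pi_hat = [visit_counts[x] / total if total else 0.0 for x in range(n_states)]
    return P_hat, pi_hat


# Toy data (hypothetical): M = 2 paths of length T = 4 on 2 states.
paths = [[0, 1, 0, 1], [1, 1, 0, 0]]
P_hat, pi_hat = estimate_from_paths(paths, n_states=2)
```

Because counts are summed before normalising, many short paths and a few long paths enter the estimate in the same way, which is the trade-off between $M$ and $T$ that the concentration bounds quantify.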
