PAC-Bayes Bounds on Variational Tempered Posteriors for Markov Models
Imon Banerjee, Vinayak A. Rao, Harsha Honnappa
TL;DR
The paper develops PAC-Bayesian bounds for variational Bayes approximations to tempered posteriors over Markov-chain parameters, linking the resulting risk to the mixing and ergodic properties of the data-generating process. It extends existing results for i.i.d. data to dependent data, providing finite-sample risk bounds for stationary and non-stationary Markov chains and for misspecified models, using $\alpha$-Rényi divergences and KL-based variational objectives. The theory is illustrated on several Markov models (finite- and infinite-state birth–death chains, Gaussian linear models) and yields conditions under which variational posteriors concentrate despite temporal dependence and misspecification. The results offer principled guarantees for tempered variational inference in temporally dependent settings and guide model-checking under misspecification, with avenues for future work on continuous-time and non-homogeneous Markov dynamics.
Abstract
Datasets displaying temporal dependencies abound in science and engineering applications, with Markov models representing a simplified and popular view of the temporal dependence structure. In this paper, we consider Bayesian settings that place prior distributions over the parameters of the transition kernel of a Markov model, and seek to characterize the resulting, typically intractable, posterior distributions. We present a PAC-Bayesian analysis of variational Bayes (VB) approximations to tempered Bayesian posterior distributions, bounding the model risk of the VB approximations. Tempered posteriors are known to be robust to model misspecification, and their variational approximations do not suffer from the usual problem of overconfident approximations. Our results tie the risk bounds to the mixing and ergodic properties of the Markov data-generating model. We illustrate the PAC-Bayes bounds through a number of example Markov models, and also consider the situation where the Markov model is misspecified.
