Universal time-series forecasting with mixture predictors
Daniil Ryabko
TL;DR
This work establishes that universal sequential forecasting can be achieved via mixture predictors across broad probabilistic settings, with strong finite-time Kullback-Leibler (KL) guarantees and minimax optimality in the realizable case. It gives a precise characterization, in terms of total variation (TV), of when a predictor exists, using separability and a Lebesgue-type decomposition. It also shows that under KL loss the minimax predictor is achievable by mixtures, while in non-realizable cases Bayesian mixtures can fall short. The middle-case framework connects the realizable and non-realizable analyses, showing that mixtures remain effective under weaker assumptions, and decision-theoretic interpretations tie these results to minimax and complete-class concepts. Collectively, the results supply a rigorous foundation for using mixture predictors in nonparametric, high-capacity sequence-prediction problems and illuminate when enlarging the model class is advantageous.
Abstract
This book is devoted to the problem of sequential probability forecasting, that is, predicting the probabilities of the next outcome of a growing sequence of observations given the past. The problem is considered in a very general setting that unifies commonly used probabilistic and non-probabilistic settings, making as few assumptions as possible about the mechanism generating the observations. A common form that arises in various formulations of this problem is that of mixture predictors, which are formed as a combination of a finite or infinite set of other predictors, with the aim of combining their predictive powers. The main subject of this book is such mixture predictors; the main results demonstrate the universality of this method in a very general probabilistic setting, but also show some of its limitations. While the problems considered are motivated by practical applications involving, for example, financial, biological or behavioural data, this motivation is left implicit and all the results presented are theoretical. The book targets graduate students and researchers interested in the problem of sequential prediction and, more generally, in the theoretical analysis of problems in machine learning and non-parametric statistics, as well as in the mathematical and philosophical foundations of these fields. The material in this volume presumes familiarity with basic concepts of probability and statistics, up to and including probability distributions over spaces of infinite sequences. Familiarity with the literature on learning or stochastic processes is not required.
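As a concrete illustration of what a mixture predictor is, the following minimal sketch combines a finite set of predictors through Bayesian weight updates on a binary sequence. The choice of i.i.d. Bernoulli predictors, the uniform prior, and the names `run_mixture` and `thetas` are assumptions made only for this example and are not taken from the book. The sketch relies on the standard fact that a mixture with prior weight w on a predictor incurs cumulative log loss at most -log w above that predictor's loss on any sequence.

```python
import math

def run_mixture(sequence, thetas, prior=None):
    """Bayesian mixture over i.i.d. Bernoulli(theta) predictors (illustrative).

    sequence: iterable of 0/1 outcomes.
    thetas:   each predictor's fixed probability of a 1, strictly in (0, 1).
    prior:    optional prior weights summing to 1 (uniform if omitted).
    Returns (mixture log loss, list of per-predictor log losses), in nats.
    """
    k = len(thetas)
    weights = list(prior) if prior is not None else [1.0 / k] * k
    mix_loss = 0.0
    pred_loss = [0.0] * k
    for x in sequence:
        # Probability each predictor assigned to the symbol that occurred.
        probs = [t if x == 1 else 1.0 - t for t in thetas]
        # The mixture forecast is the weighted average of those probabilities.
        mix_p = sum(w * p for w, p in zip(weights, probs))
        mix_loss += -math.log(mix_p)
        for i, p in enumerate(probs):
            pred_loss[i] += -math.log(p)
        # Bayes update: reweight each predictor by how well it forecast x.
        weights = [w * p / mix_p for w, p in zip(weights, probs)]
    return mix_loss, pred_loss

mix_loss, pred_loss = run_mixture([1, 1, 0, 1, 1, 1, 0, 1], [0.2, 0.5, 0.8])
# With a uniform prior over 3 predictors, the excess loss is at most log(3).
print(mix_loss - min(pred_loss), "<=", math.log(3))
```

With a uniform prior over k predictors the excess loss is bounded by log k regardless of the sequence length, which is the simplest finite-time guarantee of this kind. Taking expectations under a data-generating measure turns the same inequality into a bound on cumulative KL divergence.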
