Universal time-series forecasting with mixture predictors

Daniil Ryabko

TL;DR

This work establishes that universal sequential forecasting can be achieved via mixture predictors across broad probabilistic settings, with strong finite-time KL guarantees and minimax optimality in the realizable case. It provides a precise TV-based characterization of when a predictor exists through separability and Lebesgue-type decomposition, and shows that in KL loss the minimax predictor is achievable by mixtures while non-realizable cases can resist Bayesian mixtures. The middle-case framework connects realizable and non-realizable analyses, showing mixtures remain effective under weaker assumptions, and decision-theoretic interpretations tie these results to minimax and complete-class concepts. Collectively, the results supply a rigorous foundation for using mixture predictors in nonparametric, high-capacity sequence-prediction problems and illuminate when enlarging the model class is advantageous.

Abstract

This book is devoted to the problem of sequential probability forecasting, that is, predicting the probabilities of the next outcome of a growing sequence of observations given the past. This problem is considered in a very general setting that unifies commonly used probabilistic and non-probabilistic settings, making as few assumptions as possible about the mechanism generating the observations. A common form that arises in various formulations of this problem is that of mixture predictors, which are formed as a combination of a finite or infinite set of other predictors, with the aim of combining their predictive powers. The main subject of this book is such mixture predictors, and the main results demonstrate the universality of this method in a very general probabilistic setting, but also show some of its limitations. While the problems considered are motivated by practical applications, involving, for example, financial, biological or behavioural data, this motivation is left implicit and all the results presented are theoretical. The book targets graduate students and researchers interested in the problem of sequential prediction and, more generally, in the theoretical analysis of problems in machine learning and non-parametric statistics, as well as the mathematical and philosophical foundations of these fields. The material in this volume is presented in a way that presumes familiarity with basic concepts of probability and statistics, up to and including probability distributions over spaces of infinite sequences. Familiarity with the literature on learning or stochastic processes is not required.
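To make the notion of a mixture predictor concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the book: the finite set of i.i.d. Bernoulli predictors, the uniform prior, the sample sequence, and the function name `mixture_log_loss` are all assumptions chosen for brevity. The sketch uses the standard fact that a mixture $\nu = \sum_k w_k \mu_k$ satisfies $-\log \nu(x_{1..n}) \le -\log \mu_k(x_{1..n}) + \log(1/w_k)$ for every component $\mu_k$, so its cumulative log loss exceeds that of any single component by at most a constant.

```python
import math


def mixture_log_loss(experts, weights, xs):
    """Cumulative log loss (in nats) of a Bayesian mixture and of each expert.

    experts: probabilities p_k that expert k assigns to the symbol 1
             (hypothetical i.i.d. Bernoulli predictors, each 0 < p_k < 1)
    weights: prior weights w_k, positive and summing to 1
    xs:      observed binary sequence
    """
    posterior = list(weights)
    mix_loss = 0.0
    expert_loss = [0.0] * len(experts)
    for x in xs:
        # Probability each expert assigns to the symbol that actually occurred.
        probs = [p if x == 1 else 1.0 - p for p in experts]
        # The mixture predicts with the posterior-weighted average.
        mix_prob = sum(w * q for w, q in zip(posterior, probs))
        mix_loss -= math.log(mix_prob)
        for k, q in enumerate(probs):
            expert_loss[k] -= math.log(q)
        # Bayes update of the weights after seeing x.
        posterior = [w * q / mix_prob for w, q in zip(posterior, probs)]
    return mix_loss, expert_loss


# Illustrative data (an assumption, not from the source).
experts = [0.2, 0.5, 0.8]
weights = [1.0 / 3.0] * 3
xs = [1, 1, 0, 1, 1, 1, 0, 1]
mix, per_expert = mixture_log_loss(experts, weights, xs)

# The mixture is never worse than any expert by more than log(1 / w_k).
for k, loss in enumerate(per_expert):
    assert mix <= loss + math.log(1.0 / weights[k]) + 1e-9
```

The bound checked in the final loop holds for any countable mixture with positive weights; the results summarised here concern much more general classes, for which this sketch is only a toy analogue.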

Paper Structure

This paper contains 53 sections, 32 theorems, and 169 equations.

Key Result

theorem 1

For every set $\mathcal{C}$ of probability measures and for every predictor $\rho$ (measure) there is a mixture predictor $\nu$ over $\mathcal{C}$ such that for every $\mu\in\mathcal{C}$ we have with only small constants hidden inside $O()$.

Theorems & Definitions (77)

  • theorem 1
  • definition: Classes of processes: all, discrete, Markov, stationary
  • lemma: It is impossible to predict every process
  • Proof 1
  • definition: Mixture predictors
  • definition: unconditional total variation distance
  • definition
  • definition
  • definition: absolute continuity; dominance
  • theorem 2: Blackwell:62, Kalai:94
  • ...and 67 more