Forecastability as an Information-Theoretic Limit on Prediction

Peter Maurice Catt

Abstract

Forecasting is usually framed as a problem of model choice. This paper starts earlier, asking how much predictive information is available at each horizon. Under logarithmic loss, the answer is exact: the mutual information between the future observation and the declared information set equals the maximum achievable reduction in expected loss. This paper develops the consequences of that identity. Forecastability, defined as this mutual information evaluated across horizons, forms a profile whose shape reflects the dependence structure of the process and need not be monotone. Three structural properties are derived: compression of the information set can only reduce forecastability; the gap between the profile under a finite lag window and the full history gives an exact truncation error budget; and for processes with periodic dependence, the profile inherits the periodicity. Predictive loss decomposes into an irreducible component fixed by the information structure and an approximation component attributable to the method; their ratio defines the exploitation ratio, a normalised diagnostic for method adequacy. The exact equality is specific to log loss, but when forecastability is near zero, classical inequalities imply that no method under any loss can materially improve on the unconditional baseline. The framework provides a theoretical foundation for assessing, prior to any modelling, whether the declared information set contains sufficient predictive information at the horizon of interest.
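
The identity in the abstract admits a direct numerical check. The sketch below is illustrative rather than code from the paper: for a stationary Gaussian AR(1), both sides have closed forms, and the Monte Carlo gap in expected log loss between the unconditional baseline and the true conditional should match the mutual information $F(h) = -\tfrac{1}{2}\log(1-\phi^{2h})$ at every horizon (for an AR(1) the full history carries no more information than $Y_t$). The parameter values and simulation setup are assumptions chosen for illustration.

```python
# Sketch (not from the paper): checks that the reduction in expected log loss
# from conditioning equals the mutual information, for a Gaussian AR(1).
import numpy as np

rng = np.random.default_rng(0)
phi, sigma2 = 0.95, 1.0                 # illustrative AR(1) parameters
v = sigma2 / (1 - phi**2)               # stationary variance of the AR(1)

def neg_log_gauss(y, mean, var):
    """Negative log density of N(mean, var) at y."""
    return 0.5 * np.log(2 * np.pi * var) + (y - mean) ** 2 / (2 * var)

n = 200_000
y0 = rng.normal(0.0, np.sqrt(v), size=n)          # stationary draw of Y_t
for h in (1, 5, 20):
    s2_h = v * (1 - phi ** (2 * h))               # Var(Y_{t+h} | Y_t)
    yh = phi**h * y0 + rng.normal(0.0, np.sqrt(s2_h), size=n)
    loss_uncond = neg_log_gauss(yh, 0.0, v).mean()           # baseline N(0, v)
    loss_cond = neg_log_gauss(yh, phi**h * y0, s2_h).mean()  # true conditional
    F_h = -0.5 * np.log(1 - phi ** (2 * h))       # closed-form I(Y_{t+h}; Y_t), nats
    print(f"h={h:2d}  log-loss gap={loss_uncond - loss_cond:.4f}  F(h)={F_h:.4f}")
```

The Monte Carlo gap agrees with the closed form to within sampling noise, and the horizon sweep traces out exactly the kind of decaying profile the paper calls forecastability.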


Paper Structure

This paper contains 20 sections, 8 theorems, 19 equations, and 1 figure.

Key Result

Theorem 1

For any predictive distribution $q_h(\cdot \mid \mathcal{I}_t)$ satisfying Assumption 1,
$$\mathbb{E}\left[-\log q_h(Y_{t+h} \mid \mathcal{I}_t)\right] \;\ge\; \mathbb{E}\left[-\log p_h^\star(Y_{t+h} \mid \mathcal{I}_t)\right],$$
with equality if and only if $q_h = p_h^\star$ almost surely.
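
A minimal sanity check of Theorem 1, assuming (as in the paper's examples) a Gaussian true conditional: the excess expected log loss of a forecaster $q_h$ over $p_h^\star$ equals the Kullback–Leibler divergence $\mathrm{KL}(p_h^\star \,\|\, q_h)$, which is non-negative and vanishes only at $q_h = p_h^\star$. The specific means and variances below are hypothetical.

```python
# Sketch (illustrative, not the paper's code): the excess log loss of a
# misspecified Gaussian forecaster q over the true conditional p* should
# match the closed-form KL divergence KL(p* || q).
import numpy as np

rng = np.random.default_rng(1)
mu_star, var_star = 0.0, 1.0          # true conditional p* (assumed Gaussian)
mu_q, var_q = 0.4, 1.5                # a hypothetical misspecified forecaster q

def neg_log_gauss(y, mean, var):
    return 0.5 * np.log(2 * np.pi * var) + (y - mean) ** 2 / (2 * var)

y = rng.normal(mu_star, np.sqrt(var_star), size=500_000)   # Y ~ p*
excess = (neg_log_gauss(y, mu_q, var_q).mean()
          - neg_log_gauss(y, mu_star, var_star).mean())

# Closed-form KL(N(mu*, var*) || N(mu_q, var_q)) for comparison
kl = 0.5 * (np.log(var_q / var_star)
            + (var_star + (mu_star - mu_q) ** 2) / var_q - 1)
print(f"excess log loss = {excess:.4f}, KL(p*||q) = {kl:.4f}")  # ~equal, both >= 0
```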

Figures (1)

  • Figure 1: Forecastability profiles for Gaussian processes. (a) AR(1) with $\phi = 0.95$ (solid) and $\phi = 0.3$ (dashed), using the full history. (b) Multiplicative seasonal $\mathrm{AR}(1) \times (1)_{12}$ with $\phi = 0.5$, $\Phi = 0.8$, using $\mathcal{I}_t = \sigma(Y_t)$ to illustrate non-monotonicity. A sketch reproducing both profiles follows this list.
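
Both panels can be reproduced in a few lines. Panel (a) uses the AR(1) closed form; panel (b) uses the fact that for a jointly Gaussian pair $I(Y_{t+h}; Y_t) = -\tfrac{1}{2}\log(1-\rho(h)^2)$, with $\rho(h)$ estimated from a long simulation of the multiplicative seasonal model. This is my reconstruction of the figure's inputs, assuming Gaussianity as the caption states, not the paper's plotting code.

```python
# Sketch reproducing the two profile shapes in Figure 1.
import numpy as np

H = np.arange(1, 37)

# (a) AR(1), full history: F(h) = -1/2 * log(1 - phi^(2h)) nats.
for phi in (0.95, 0.3):
    F = -0.5 * np.log(1 - phi ** (2 * H))
    print(f"AR(1) phi={phi}: F(1)={F[0]:.3f}, F(12)={F[11]:.3f}, F(36)={F[-1]:.3f}")

# (b) Seasonal AR(1)x(1)_12 with I_t = sigma(Y_t): simulate
# (1 - phi B)(1 - Phi B^12) Y_t = eps_t, i.e. an AR(13) with coefficients
# at lags 1, 12, 13, then convert the lag-h autocorrelation rho(h) to
# F(h) = -1/2 * log(1 - rho(h)^2).
rng = np.random.default_rng(2)
phi, Phi = 0.5, 0.8
n, burn = 400_000, 2_000
y = np.zeros(n + burn)
eps = rng.normal(size=n + burn)
for t in range(13, n + burn):
    y[t] = phi * y[t-1] + Phi * y[t-12] - phi * Phi * y[t-13] + eps[t]
y = y[burn:]

yc = y - y.mean()
rho = np.array([np.dot(yc[:-h], yc[h:]) / np.dot(yc, yc) for h in H])
F_seasonal = -0.5 * np.log(1 - rho**2)
print("seasonal F(h) near h = 1, 11, 12, 13, 23, 24, 25:",
      np.round(F_seasonal[[0, 10, 11, 12, 22, 23, 24]], 3))
```

The seasonal profile rises again near $h = 12$ and $h = 24$, which is the non-monotonicity the caption refers to: the profile inherits the periodicity of the dependence structure.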

Theorems & Definitions (16)

  • Theorem 1: Conditional-distribution optimality
  • Proof
  • Theorem 2: Information bound on prediction
  • Proof
  • Definition 3: Forecastability
  • Proposition 4: Maximum forecastability and entropy rate
  • Proof
  • Corollary 5: Forecastability under information-set compression
  • Proof
  • Proposition 6: Finite-window information loss
  • ...and 6 more
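
Finally, the exploitation ratio from the abstract invites a worked sketch. Its formal definition sits in the paper body, not on this page, so the normalisation below, realised log-loss reduction divided by the maximum achievable reduction $F(h)$, is one plausible reading adopted purely for illustration; the misspecified AR(1) forecaster is hypothetical.

```python
# Sketch of an exploitation-ratio style diagnostic (one plausible reading
# of the abstract's definition, assumed here): realised log-loss reduction
# over the maximum achievable reduction F(h), so 1 = fully exploited.
import numpy as np

rng = np.random.default_rng(3)
phi, phi_hat = 0.95, 0.7              # true vs. deliberately misspecified AR(1)
v = 1.0 / (1 - phi**2)                # stationary variance, sigma^2 = 1

def neg_log_gauss(y, mean, var):
    return 0.5 * np.log(2 * np.pi * var) + (y - mean) ** 2 / (2 * var)

n, h = 500_000, 1
y0 = rng.normal(0.0, np.sqrt(v), size=n)
s2_h = v * (1 - phi ** (2 * h))
yh = phi**h * y0 + rng.normal(0.0, np.sqrt(s2_h), size=n)

loss_uncond = neg_log_gauss(yh, 0.0, v).mean()
loss_method = neg_log_gauss(yh, phi_hat**h * y0,
                            v * (1 - phi_hat ** (2 * h))).mean()
F_h = -0.5 * np.log(1 - phi ** (2 * h))     # maximum achievable reduction

ratio = (loss_uncond - loss_method) / F_h   # ~0.58 for these parameters
print(f"exploitation ratio at h={h}: {ratio:.2f}")
```

Under this reading, a ratio near 1 says the method extracts nearly all the predictive information the declared information set contains, while a low ratio at high $F(h)$ flags method inadequacy rather than an unforecastable process.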