Regularized Ensemble Forecasting for Learning Weights from Historical and Current Forecasts

Han Su, Xiaojia Guo, Xiaoke Zhang

TL;DR

This paper introduces Regularized Ensemble Forecasting (REF), a flexible framework that combines current expert forecasts with historical performance through a regularized variance-minimization objective. By allowing a monotone transformation and a penalty function, REF can emulate Bayesian posterior updates under various priors and distributions, providing a principled link between frequentist ensemble weights and Bayesian inference. The method admits practical implementations via a softmax parameterization of the weights and rolling-window tuning of the regularization strength, with nuisance parameters estimated from data. Empirically, REF outperforms standard current-only and history-only baselines on the M5 Walmart dataset and on the Survey of Professional Forecasters (SPF) macroeconomic forecasts, while offering insights into when current forecasts or historical performance drive the improvements. The work highlights the adaptability of REF to changing expert pools and horizons and points to future extensions to probabilistic forecasts and non-continuous outcomes.

Abstract

Combining forecasts from multiple experts often yields more accurate results than relying on a single expert. In this paper, we introduce a novel regularized ensemble method that extends the traditional linear opinion pool by leveraging both current forecasts and historical performances to set the weights. Unlike existing approaches that rely only on either the current forecasts or past accuracy, our method accounts for both sources simultaneously. It learns weights by minimizing the variance of the combined forecast (or its transformed version) while incorporating a regularization term informed by historical performances. We also show that this approach has a Bayesian interpretation. Different distributional assumptions within this Bayesian framework yield different functional forms for the variance component and the regularization term, adapting the method to various scenarios. In empirical studies on Walmart sales and macroeconomic forecasting, our ensemble outperforms leading benchmark models both when experts' full forecasting histories are available and when experts enter and exit over time, resulting in incomplete historical records. Throughout, we provide illustrative examples that show how the optimal weights are determined and, based on the empirical results, we discuss where the framework's strengths lie and when experts' past versus current forecasts are more informative.
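
To make the weight-learning step concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. It assumes an identity transformation, takes the variance term to be the weighted dispersion of the expert forecasts around the combined forecast, and uses a Kullback-Leibler penalty toward prior weights derived from past accuracy. The function names (`softmax`, `objective`, `gradient`, `fit_weights`), the example numbers, and the choice of gradient descent with step halving are all hypothetical illustrations.

```python
import math


def softmax(theta):
    """Map unconstrained scores to positive weights that sum to one."""
    shift = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - shift) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def objective(weights, forecasts, prior, lam):
    """Assumed objective: weighted dispersion + lam * KL(weights || prior)."""
    mean = sum(w * mu for w, mu in zip(weights, forecasts))
    variance = sum(w * (mu - mean) ** 2 for w, mu in zip(weights, forecasts))
    # Terms with w == 0 contribute nothing to the KL divergence.
    penalty = sum(w * math.log(w / s) for w, s in zip(weights, prior) if w > 0)
    return variance + lam * penalty


def gradient(theta, forecasts, prior, lam):
    """Gradient of the assumed objective with respect to the scores theta."""
    w = softmax(theta)
    mean = sum(wi * mu for wi, mu in zip(w, forecasts))
    # Partial derivatives with respect to each weight, up to an additive
    # constant that cancels when passing through the softmax Jacobian.
    g = [
        (mu - mean) ** 2 + (lam * math.log(wi / si) if wi > 0 else 0.0)
        for wi, mu, si in zip(w, forecasts, prior)
    ]
    g_bar = sum(wi * gi for wi, gi in zip(w, g))
    return [wi * (gi - g_bar) for wi, gi in zip(w, g)]


def fit_weights(forecasts, prior, lam, steps=500):
    """Gradient descent with step halving, started at the prior weights.

    The prior weights must be strictly positive.
    """
    theta = [math.log(s) for s in prior]
    value = objective(softmax(theta), forecasts, prior, lam)
    for _ in range(steps):
        grad = gradient(theta, forecasts, prior, lam)
        step = 1.0
        while step > 1e-12:
            candidate = [t - step * g for t, g in zip(theta, grad)]
            cand_value = objective(softmax(candidate), forecasts, prior, lam)
            if cand_value < value:
                theta, value = candidate, cand_value
                break
            step *= 0.5
        else:
            break  # no step improves the objective: treat as converged
    return softmax(theta)


# Hypothetical inputs: expert 3 disagrees with the other two experts.
forecasts = [1.0, 1.2, 3.0]
prior = [0.5, 0.3, 0.2]  # hypothetical prior weights from past accuracy
for lam in (0.5, 1e6):
    print(lam, [round(w, 3) for w in fit_weights(forecasts, prior, lam)])
```

In this sketch a small regularization strength lets the current forecasts dominate, so the dissenting expert loses weight, while a very large one keeps the weights close to the prior. The assumed objective is not convex in the weights, so the result can depend on the starting point; starting at the prior weights is a design choice of the sketch.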

Paper Structure

This paper contains 35 sections, 6 theorems, 104 equations, 6 figures, 17 tables.

Key Result

Proposition 1

The mean squared prediction error (MSPE) for the simple mean ensemble satisfies

Figures (6)

  • Figure 1: Contours of Expert 1's optimal weight $w_1^*$ for varying prior weight $s_1$ and squared deviation $(\mu_1 - \mathbb{E}(\mu))^2$, holding Expert 2's forecast fixed, under four combinations of the transformation $f$ and penalty $\Phi$.
  • Figure 2: Regularized weight optimization with two experts ($k=2$) under four combinations of the transformation $f$ and penalty $\Phi$: variance level sets (concentric ellipses), penalty feasible region (gray shading), linear constraint $w_1 + w_2 = 1$ (diagonal line), prior weights $\boldsymbol{s}$ and optimal weights (points) for different $\lambda$.
  • Figure 3: Illustration of the rolling-window validation procedure.
  • Figure 4: Performance improvement ($\Delta$RMSSE) vs. penalty share (PS) bins across baseline models in the M5 study.
  • Figure 5: Performance improvement ($\Delta$RMSSE) vs. penalty share (PS) bins across baseline models in the SPF study under different forecast horizons $h=0$ and $h=1$.
  • ...and 1 more figure

Theorems & Definitions (7)

  • Proposition 1
  • Theorem 1
  • Example 1
  • Lemma 1: Corollary 1.(iii) of garg:2017
  • Corollary 1
  • Lemma 2
  • Lemma 3