Using dynamic loss weighting to boost improvements in forecast stability

Daan Caljon, Jeff Vercauteren, Simon De Vos, Wouter Verbeke, Jente Van Belle

TL;DR

This work tackles rolling origin forecast instability by building on an extension of the N-BEATS model that adds forecast stability as an optimization objective, and explores dynamic loss weighting (DLW) to adjust the balance between forecast error and forecast instability during training. It systematically evaluates multiple DLW methods, including a novel Task-Aware Random Weighting (TARW) variant, on the M3 and M4 monthly data sets, showing that several DLW approaches can improve stability without harming accuracy and that TARW often achieves the best trade-off. The findings indicate that DLW methods provide Pareto-efficient options for improving stability and offer practical guidance on when to prioritize stability versus accuracy during training. The work suggests TARW as a simple, strong baseline for auxiliary-learning setups and points to extensions to other models, data sets, and notions of instability.

Abstract

Rolling origin forecast instability refers to variability in forecasts for a specific period induced by updating the forecast when new data points become available. Recently, an extension to the N-BEATS model for univariate time series point forecasting was proposed to include forecast stability as an additional optimization objective, next to accuracy. It was shown that more stable forecasts can be obtained without harming accuracy by minimizing a composite loss function that contains both a forecast error and a forecast instability component, with a static hyperparameter to control the impact of stability. In this paper, we empirically investigate whether further improvements in stability can be obtained without compromising accuracy by applying dynamic loss weighting algorithms, which change the loss weights during training. We show that existing dynamic loss weighting methods can achieve this objective and provide insights into why this might be the case. Additionally, we propose an extension to the Random Weighting approach -- Task-Aware Random Weighting -- which also achieves this objective.
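
To make the distinction between static and dynamic weighting concrete, the following is a minimal sketch, not the authors' implementation. It assumes the composite loss is a convex combination of the two components controlled by a single hyperparameter `lam`, and it illustrates dynamic weighting with the generic random-weighting idea (softmax of freshly sampled normal values at every training step). The function names, the exact form of the loss, and the numeric values are illustrative assumptions; Task-Aware Random Weighting itself is not reproduced here.

```python
import numpy as np


def composite_loss(error, instability, lam):
    """Static weighting: convex combination of error and instability (assumed form)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * error + lam * instability


def sample_random_weights(rng, n_tasks=2):
    """Dynamic weighting: softmax of standard-normal draws, so the weights sum to 1."""
    z = rng.standard_normal(n_tasks)
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


rng = np.random.default_rng(0)
error, instability = 0.80, 0.30  # made-up loss values for one training step

# Static: the same lam is used at every training step.
static = composite_loss(error, instability, lam=0.25)

# Dynamic: new weights are drawn at every training step.
w_error, w_instability = sample_random_weights(rng)
dynamic = w_error * error + w_instability * instability
```

In an actual training loop, `error` and `instability` would be differentiable loss tensors and the weighted sum would be backpropagated. The point of the sketch is only that the static variant fixes the weights once, while a dynamic method recomputes them during training.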

Paper Structure (20 sections, 6 equations, 10 figures, 5 tables, 1 algorithm)

Figures (10)

  • Figure 1: Generic N-BEATS architecture. Figure sourced from vanbelle2023.
  • Figure 2: Pareto frontiers for the M3 and M4 data sets.
  • Figure 3: MCB results for the M3 monthly data set. Lower is better. If two intervals overlap, there is no statistically significant difference between the corresponding methods.
  • Figure 4: MCB results for the M4 monthly data set. Lower is better. If two intervals overlap, there is no statistically significant difference between the corresponding methods.
  • Figure 5: Evolution of $\lambda_i$ during training using GradNorm.
  • ...and 5 more figures