Optimal starting point for time series forecasting
Yiming Zhong, Yinuo Ren, Guangyao Cao, Feng Li, Haobo Qi
TL;DR
The paper tackles forecast degradation caused by structural breaks and concept drift in time series, proposing the Optimal Starting Point Time Series Forecast (OSP-TSP) framework to automatically identify an optimal starting interval for forecasting. The approach learns an interval predictor from time-series features with XGBoost or LightGBM, then generates forecasts by applying the base model to $n$ subsequences sampled within the predicted interval and averaging the results. It is evaluated across frequencies on the M4 dataset and on other real-world data. Key contributions include a feature-driven method for locating the most promising subsequence to forecast from, a demonstration of improved MASE (and often MAPE) when OSP-TSP is combined with various base models (e.g., ETS, thetaf, ARIMA, nnetar, FFORMA), and an analysis of feature importance (e.g., curvature, linearity, seas_acf1). The findings suggest that restricting forecasts to a subsequence selected in a data-driven way can mitigate the impact of structural changes, with practical implications for deploying robust time-series forecasting in dynamic environments. The paper also discusses pre-training, data augmentation via GRATIS, and future work on computation and weighting schemes.
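The forecast-averaging step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `drift_forecast` and `osp_forecast` are hypothetical, the drift method stands in for the base models named in the paper, the interval is supplied by hand rather than predicted by XGBoost or LightGBM, and uniform sampling of start points within the interval is an assumption.

```python
import random
from statistics import fmean


def drift_forecast(history, horizon):
    """Stand-in base model (drift method): extend the line from first to last point."""
    if len(history) < 2:
        return [history[-1]] * horizon
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(horizon)]


def osp_forecast(series, interval, horizon, n=5, base_model=drift_forecast, seed=0):
    """Average base-model forecasts over n start points sampled inside `interval`.

    `interval` is an inclusive (lo, hi) pair of start indices. In the paper it
    would come from the learned interval predictor; here it is given directly.
    """
    lo, hi = interval
    if not (0 <= lo <= hi < len(series)):
        raise ValueError("interval must lie inside the series")
    rng = random.Random(seed)
    # Assumption: start points are drawn uniformly at random within the interval.
    starts = [rng.randint(lo, hi) for _ in range(n)]
    # One forecast per subsequence series[s:], each of length `horizon`.
    forecasts = [base_model(series[s:], horizon) for s in starts]
    # Average the n forecasts step by step over the horizon.
    return [fmean(step) for step in zip(*forecasts)]


# Toy series with a structural break at index 20: flat at 100, then a ramp 0..19.
series = [100.0] * 20 + [float(i) for i in range(20)]
full = drift_forecast(series, 3)            # uses the whole history
osp = osp_forecast(series, (20, 30), 3)     # starts after the break
```

On this toy series the full-history forecast is pulled downward by the pre-break level, while the forecast built from post-break subsequences continues the recent upward trend.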
Abstract
Recent advances in time series forecasting mainly focus on improving the forecasting models themselves. However, when the time series data suffer from potential structural breaks or concept drift, forecasting performance might be significantly reduced. In this paper, we introduce a novel approach called Optimal Starting Point Time Series Forecast (OSP-TSP) for optimal forecasting, which can be combined with existing time series forecasting models. By using XGBoost and LightGBM models to adjust the sequence length, the proposed approach determines the optimal starting point (OSP) of the time series and thereby enhances the prediction performance of the base forecasting models. To illustrate the effectiveness of the proposed approach, comprehensive empirical analyses have been conducted on the M4 dataset and other real-world datasets. Empirical results indicate that predictions based on the OSP-TSP approach consistently outperform those using the complete time series dataset. Moreover, comparison results reveal that combining our approach with existing forecasting models can achieve better prediction accuracy, which also reflects the advantages of the proposed approach.
