Lag Selection for Univariate Time Series Forecasting using Deep Learning: An Empirical Study
José Leites, Vitor Cerqueira, Carlos Soares
TL;DR
This paper addresses the problem of selecting the number of lags $p$ for univariate time series forecasting with deep learning in a global setting, focusing on the NHITS architecture. It presents an extensive empirical analysis across three monthly datasets totaling 2411 time series and 321,734 observations, comparing multiple lag-selection methods, including cross-validation (CV), PACF thresholds, and heuristic rules, with the forecast horizon $H$ set as specified for each dataset. The key finding is that the lag size substantially affects forecast accuracy: both very small and very large input windows harm performance. CV-based lag selection gives the best overall performance, while PACF@0.01 and several heuristics perform comparably, which highlights a trade-off between accuracy and computational cost. The results underscore the value of adaptive lag strategies for global deep learning forecasting and motivate future work on ensemble or dynamic lag selection. Code is provided for reproducibility.
Abstract
Most forecasting methods use recent past observations (lags) to model the future values of univariate time series. Selecting an adequate number of lags is important for training accurate forecasting models. Several approaches and heuristics have been devised for this task, but there is no consensus about which is best. Moreover, lag selection procedures have been developed for local models and classical forecasting techniques such as ARIMA. We bridge this gap in the literature by carrying out an extensive empirical analysis of different lag selection methods. We focus on deep learning methods trained with a global approach, i.e., on datasets comprising multiple univariate time series. The experiments were carried out using three benchmark databases that contain a total of 2411 univariate time series. The results indicate that the lag size is a relevant parameter for accurate forecasts. In particular, excessively small or excessively large lag sizes have a considerable negative impact on forecasting performance. Cross-validation approaches show the best performance for lag selection, but this performance is comparable to that of simple heuristics.
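To make the role of the lag size concrete, the following is a minimal sketch of how $p$ lags and a horizon $H$ might turn a collection of series into one pooled training set for a global model. It is an illustration under stated assumptions, not the authors' implementation: the function names `lag_embed` and `global_dataset` are hypothetical, and a plain NumPy sliding window stands in for whatever preprocessing the NHITS pipeline actually uses.

```python
import numpy as np

def lag_embed(series, p, H):
    """Hypothetical helper: build (X, Y) where each row of X holds
    p consecutive lags and the matching row of Y holds the next H values."""
    y = np.asarray(series, dtype=float)
    n = len(y) - p - H + 1  # number of complete (input, target) windows
    if n <= 0:
        raise ValueError("series too short for the chosen p and H")
    X = np.stack([y[i : i + p] for i in range(n)])
    Y = np.stack([y[i + p : i + p + H] for i in range(n)])
    return X, Y

def global_dataset(collection, p, H):
    """Hypothetical helper: pool windows from all series that are long
    enough into a single training set (the global approach).
    Assumes at least one series has p + H or more observations."""
    parts = [lag_embed(s, p, H) for s in collection if len(s) >= p + H]
    X = np.concatenate([x for x, _ in parts])
    Y = np.concatenate([t for _, t in parts])
    return X, Y

# Toy usage: three series of different lengths, p = 3 lags, H = 2 steps ahead.
toy = [np.arange(10), np.arange(6), np.arange(4)]
X, Y = global_dataset(toy, p=3, H=2)
```

In this sketch, a larger $p$ widens each input row but reduces the number of windows and can exclude short series entirely, which is one intuition for why very large lag sizes may hurt; a very small $p$ keeps more windows but gives the model little context.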
