Lag Selection for Univariate Time Series Forecasting using Deep Learning: An Empirical Study

José Leites, Vitor Cerqueira, Carlos Soares

TL;DR

This paper studies how to select the number of lags $p$ for univariate time series forecasting with deep learning in a global setting, focusing on the NHITS architecture. It reports an extensive empirical analysis on three monthly datasets totaling 2411 time series and 321,734 observations, comparing several lag-selection methods, including cross-validation (CV), PACF thresholds, and heuristic rules, with the forecast horizon $H$ taken from each dataset. The key finding is that the lag size substantially affects forecast accuracy: both very small and very large input windows harm performance. CV-based lag selection gives the best overall performance, while PACF@0.01 and several heuristics perform comparably, which points to a trade-off between accuracy and computational cost. The results underscore the value of adaptive lag strategies for global DL forecasting and motivate future work on ensemble or dynamic lag selection. Code is provided for reproducibility.

Abstract

Most forecasting methods use recent past observations (lags) to model the future values of univariate time series. Selecting an adequate number of lags is important for training accurate forecasting models. Several approaches and heuristics have been devised to solve this task. However, there is no consensus about what the best approach is. Besides, lag selection procedures have been developed based on local models and classical forecasting techniques such as ARIMA. We bridge this gap in the literature by carrying out an extensive empirical analysis of different lag selection methods. We focus on deep learning methods trained in a global approach, i.e., on datasets comprising multiple univariate time series. The experiments were carried out using three benchmark databases that contain a total of 2411 univariate time series. The results indicate that the lag size is a relevant parameter for accurate forecasts. In particular, excessively small or excessively large lag sizes have a considerable negative impact on forecasting performance. Cross-validation approaches show the best performance for lag selection, but this performance is comparable with simple heuristics.
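To make the role of the lag size concrete, the sketch below shows one common way to turn a collection of series into a single training set for a global model: each window pairs the last $p$ observations with the next $H$ values. This is a minimal illustration, not the authors' code; the function names `make_windows` and `make_global_dataset` and the toy data are assumptions introduced here.

```python
from typing import List, Sequence, Tuple

# One training example: (p lagged inputs, H future targets).
Window = Tuple[List[float], List[float]]


def make_windows(series: Sequence[float], p: int, H: int) -> List[Window]:
    """Slide over one series, pairing the previous p values with the next H values."""
    if p < 1 or H < 1:
        raise ValueError("p and H must be positive integers")
    windows: List[Window] = []
    # t is the index of the first value to forecast.
    for t in range(p, len(series) - H + 1):
        lags = list(series[t - p:t])
        future = list(series[t:t + H])
        windows.append((lags, future))
    return windows


def make_global_dataset(collection: Sequence[Sequence[float]], p: int, H: int) -> List[Window]:
    """Pool the windows of every series into one training set (global approach)."""
    pooled: List[Window] = []
    for series in collection:
        # A series shorter than p + H yields no windows for this lag size.
        pooled.extend(make_windows(series, p, H))
    return pooled


# Hypothetical toy collection with two short series.
collection = [[1, 2, 3, 4, 5, 6], [10, 20, 30, 40, 50]]
dataset = make_global_dataset(collection, p=3, H=2)
for lags, future in dataset:
    print(lags, "->", future)
```

The sketch also hints at one reason very large lag sizes can hurt: every increase in $p$ removes a training window from each series, and series shorter than $p + H$ drop out of the training set entirely.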

Paper Structure (15 sections, 1 equation, 2 figures, 3 tables)

Figures (2)

  • Figure 1: SMAPE scores for an increasing number of lags across each dataset. The dashed red line denotes the performance of the seasonal naive baseline.
  • Figure 2: Rank distribution of each lag across all time series in the M3 dataset. For visualization purposes, we truncate the analysis to the first 60 lags.
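
The metric and baseline named in the Figure 1 caption can be sketched as follows. The SMAPE variant shown here (mean of $2|y - \hat{y}| / (|y| + |\hat{y}|)$, scaled to a percentage) is a common definition and is an assumption, since the caption does not state which variant the paper uses. The seasonal period `m=12` is likewise an assumption based on the datasets being monthly, and the function names are hypothetical.

```python
from typing import List, Sequence


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error, in percent (0 to 200)."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    total = 0.0
    for y, f in zip(actual, forecast):
        denom = abs(y) + abs(f)
        # Treat the 0/0 case as zero error to avoid division by zero.
        total += 0.0 if denom == 0 else 2.0 * abs(y - f) / denom
    return 100.0 * total / len(actual)


def seasonal_naive(history: Sequence[float], H: int, m: int = 12) -> List[float]:
    """Forecast H steps by repeating the last observed seasonal cycle of length m."""
    if len(history) < m:
        raise ValueError("history must contain at least one full season")
    last_season = list(history[-m:])
    return [last_season[h % m] for h in range(H)]


# Hypothetical two years of monthly values, then a 3-step forecast.
history = [float(v) for v in range(1, 25)]
baseline = seasonal_naive(history, H=3)
print(baseline)
print(smape([25.0, 26.0, 27.0], baseline))
```

A lag-selection method is only worth its cost if the resulting model scores below this baseline, which is why the figure draws it as a reference line.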