Time Series Analysis by State Space Learning

André Ramos, Davi Valladão, Alexandre Street

TL;DR

SSL rethinks time-series state-space models by embedding them in a high-dimensional regularized regression framework, replacing Kalman-filter reliance with convex optimization and Elastic Net/Adaptive Lasso principles. The key idea is to rewrite unobserved components and exogenous effects as $Y = X\Theta + \varepsilon$ and estimate $\Theta$ through a two-stage adaptive procedure that simultaneously performs component extraction and variable subset selection, with interpolation and extrapolation capabilities. The paper extends SSL with refinements for exogenous-variable selection, outlier handling, and initialization for Gaussian structural models, and demonstrates a generalized SSL for broader additive state-space forms, all implemented in StateSpaceLearning.jl. Empirically, SSL outperforms traditional Gaussian state-space models and Auto-SARIMA on the M4 dataset and in controlled simulations, offering improved predictive accuracy, robust forecasting, and scalable computation for high-dimensional time-series problems.
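
As a rough illustration of the regression rewrite, the sketch below builds a design matrix for a level-plus-slope trend and checks that the matrix product reproduces a recursively simulated trend. It is a minimal sketch under stated assumptions: the level recursion $\mu_t = \mu_{t-1} + \nu_{t-1} + \xi_t$ is the standard local linear trend form and is assumed here, and the helper names `build_trend_matrix`, `simulate_trend`, and `matvec` are hypothetical, not part of StateSpaceLearning.jl.

```python
# Sketch: write a local linear trend as Y = X @ Theta (pure Python).
# Assumed recursions (the level recursion is an assumption, not quoted from the paper):
#   mu_t = mu_{t-1} + nu_{t-1} + xi_t
#   nu_t = nu_{t-1} + zeta_t

def build_trend_matrix(T):
    """Return (X, names) with T rows and columns for
    mu_1, xi_2..xi_T, nu_1, zeta_2..zeta_{T-1}."""
    names = ["mu_1"] + [f"xi_{k}" for k in range(2, T + 1)]
    names += ["nu_1"] + [f"zeta_{k}" for k in range(2, T)]
    X = []
    for t in range(1, T + 1):
        row = [1.0]                                                # mu_1: constant
        row += [1.0 if t >= k else 0.0 for k in range(2, T + 1)]   # xi_k: step starting at k
        row.append(float(t - 1))                                   # nu_1: linear trend
        row += [float(max(t - k, 0)) for k in range(2, T)]         # zeta_k: ramp after k
        X.append(row)
    return X, names

def simulate_trend(mu1, nu1, xi, zeta):
    """Run the assumed recursions. xi holds xi_2..xi_T, zeta holds zeta_2..zeta_{T-1}."""
    T = len(xi) + 1
    mu, nu = [mu1], nu1
    for t in range(2, T + 1):
        mu.append(mu[-1] + nu + xi[t - 2])   # uses nu_{t-1}
        if t <= T - 1:
            nu += zeta[t - 2]                # nu_t = nu_{t-1} + zeta_t
    return mu

def matvec(X, theta):
    """Plain matrix-vector product."""
    return [sum(x * b for x, b in zip(row, theta)) for row in X]

# Toy example: one level shock at t=4 and one slope shock at t=3.
mu1, nu1 = 10.0, 1.0
xi = [0.0, 0.0, 2.0, 0.0]      # xi_2..xi_5
zeta = [0.0, 0.5, 0.0]         # zeta_2..zeta_4
X, names = build_trend_matrix(5)
theta = [mu1] + xi + [nu1] + zeta
fitted = matvec(X, theta)
simulated = simulate_trend(mu1, nu1, xi, zeta)
assert all(abs(a - b) < 1e-12 for a, b in zip(fitted, simulated))
```

In this form every shock is a regression coefficient, so a sparsity-inducing penalty on $\Theta$ can decide which shocks are active. Seasonal, outlier, and exogenous columns would be appended to the same matrix.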

Abstract

Time series analysis by state-space models is widely used for forecasting and for extracting unobservable components such as level, slope, and seasonality, along with the effects of explanatory variables. However, the reliance of these models on traditional Kalman filtering frequently hampers their effectiveness, primarily because of Gaussian assumptions and the absence of efficient subset selection methods to accommodate the multitude of potential explanatory variables in today's big-data applications. Our research introduces State Space Learning (SSL), a novel framework that leverages statistical learning for time series modeling and forecasting. By utilizing a regularized high-dimensional regression framework, our approach jointly extracts typical unobservable time series components, detects and addresses outliers, and selects relevant exogenous variables within a high-dimensional space, in polynomial time and with global optimality guarantees. Through a controlled numerical experiment, we demonstrate the superiority of our approach over relevant benchmarks in the accuracy of explanatory-variable subset selection. We also present an intuitive forecasting scheme and showcase superior performance relative to traditional time series models on a dataset of 48,000 monthly time series from the M4 competition. We extend the applicability of our approach by reformulating any linear state-space formulation featuring time-varying coefficients as a high-dimensional regularized regression, expanding the impact of our research to other engineering applications beyond time series analysis. Finally, our proposed methodology is implemented in the open-source Julia package StateSpaceLearning.jl.
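
To make the regularized-regression idea concrete, here is a minimal sketch of a two-stage L1 fit in the spirit of the adaptive Lasso. It assumes a plain weighted Lasso solved by coordinate descent, with second-stage weights set to the inverse of the first-stage coefficient magnitudes. The paper's exact penalties, tuning rule, and solver may differ, and every function name below is hypothetical rather than taken from StateSpaceLearning.jl.

```python
# Sketch: two-stage (adaptive) L1 estimation of Theta in Y = X Theta + eps.
# Objective per stage: (1/2n) * ||y - X b||^2 + lam * sum_j w_j * |b_j|  (convex in b).

def soft_threshold(z, g):
    """Soft-thresholding operator used by Lasso coordinate updates."""
    return z - g if z > g else z + g if z < -g else 0.0

def objective(X, y, beta, lam, w):
    n, p = len(y), len(beta)
    rss = sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2 for i in range(n))
    return rss / (2 * n) + lam * sum(w[j] * abs(beta[j]) for j in range(p))

def weighted_lasso(X, y, lam, w, sweeps=500):
    """Cyclic coordinate descent; each update exactly minimizes over one coefficient."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)                                   # residual y - X beta
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * w[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def two_stage_fit(X, y, lam, eps=1e-6):
    """Stage 1: unit weights. Stage 2: weights 1/(|b1_j| + eps). Column 0 is unpenalized."""
    p = len(X[0])
    w1 = [0.0] + [1.0] * (p - 1)
    b1 = weighted_lasso(X, y, lam, w1)
    w2 = [0.0] + [1.0 / (abs(b) + eps) for b in b1[1:]]
    b2 = weighted_lasso(X, y, lam, w2)
    return b1, b2, w1, w2

def local_level_matrix(T):
    """Column 0: initial level. Columns 1..T-1: step functions starting at t = 2..T."""
    return [[1.0] + [1.0 if t >= k else 0.0 for k in range(2, T + 1)]
            for t in range(1, T + 1)]

# Toy series with a single level shift between t=4 and t=5.
X = local_level_matrix(8)
y = [1.0] * 4 + [5.0] * 4
b1, b2, w1, w2 = two_stage_fit(X, y, lam=0.05)
```

Because each stage is a convex problem, any local minimum is global, which is the property behind the optimality guarantee mentioned in the abstract. The second stage penalizes coefficients that were small in the first stage more heavily, which tends to give a sparser set of active shocks.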

Paper Structure

This paper contains 18 sections, 6 theorems, 44 equations, 4 figures, 3 tables.

Key Result

Proposition 2.1

The slope component $\nu_{t+1} = \nu_{t} + \zeta_{t+1}$ can be formulated as $\nu_{t+1} = \nu_1 + \sum_{\tau=2}^{t+1}\zeta_{\tau}$.
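
The statement follows from unrolling the recursion one step at a time. A short sketch of that argument (an illustration, not the paper's proof verbatim):

$$
\begin{aligned}
\nu_{t+1} &= \nu_{t} + \zeta_{t+1} \\
          &= \nu_{t-1} + \zeta_{t} + \zeta_{t+1} \\
          &\;\;\vdots \\
          &= \nu_{1} + \sum_{\tau=2}^{t+1} \zeta_{\tau}.
\end{aligned}
$$

For instance, with $t = 2$ this gives $\nu_3 = \nu_1 + \zeta_2 + \zeta_3$.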

Figures (4)

  • Figure 1: In this example, subfigures \ref{fig:sub1} and \ref{fig:sub2} show the components $\mu_1 + \sum_{\tau=2}^{t}\xi_\tau$, and subfigures \ref{fig:sub4} and \ref{fig:sub5} show the components $(t-1)\nu_1 + \sum_{\tau=2}^{t-1}(t-\tau)\zeta_\tau$.
  • Figure 2: Fitted time series and separated components (each row represents one time series) for both the Gaussian and the SSL ($\alpha = 0.1$, AIC) models.
  • Figure 3: Model Forecasts
  • Figure 4: Model Forecasts
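
The slope expression in the Figure 1 caption can be recovered from Proposition 2.1. The sketch below assumes the standard local linear trend level recursion $\mu_t = \mu_{t-1} + \nu_{t-1} + \xi_t$, which this summary does not state explicitly. Under that assumption, the slope contributes $\sum_{s=1}^{t-1}\nu_s$ to the level at time $t$, and substituting $\nu_s = \nu_1 + \sum_{\tau=2}^{s}\zeta_\tau$ gives

$$
\sum_{s=1}^{t-1}\nu_s
= \sum_{s=1}^{t-1}\Big(\nu_1 + \sum_{\tau=2}^{s}\zeta_\tau\Big)
= (t-1)\,\nu_1 + \sum_{\tau=2}^{t-1}(t-\tau)\,\zeta_\tau ,
$$

because each $\zeta_\tau$ appears in $\nu_s$ for $s = \tau, \dots, t-1$, that is, $t-\tau$ times. The level expression $\mu_1 + \sum_{\tau=2}^{t}\xi_\tau$ arises the same way from accumulating the level shocks.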

Theorems & Definitions (12)

  • Proposition 2.1
  • proof
  • Theorem 2.1
  • proof
  • Theorem 2.2
  • proof
  • Corollary 2.2.1
  • proof
  • Proposition 2.2
  • proof
  • ...and 2 more