Effectively Modeling Time Series with Simple Discrete State Spaces

Michael Zhang, Khaled K. Saab, Michael Poli, Tri Dao, Karan Goel, Christopher Ré

TL;DR

The paper introduces SpaceTime, a deep time series architecture built on companion-matrix state-space models (SSMs) that can express autoregressive dynamics. Each layer combines multiple companion SSMs for expressivity, a closed-loop SSM variant enables long-horizon forecasting, and an FFT-based algorithm for computing the output filter reduces the cost of a forward pass from $\tilde{O}(d \ell)$ to $\tilde{O}(d + \ell)$ for sequence length $\ell$ and state-space size $d$. Empirically, SpaceTime achieves state-of-the-art or near-state-of-the-art results on ECG and speech classification and on forecasting benchmarks (Informer/Monash), while training substantially faster than Transformers and LSTMs. Together, these results show ARIMA-like expressivity, accurate long-horizon forecasting, and practical efficiency in a single architecture.

Abstract

Time series modeling is a well-established problem, which often requires that methods (1) expressively represent complicated dependencies, (2) forecast long horizons, and (3) efficiently train over long sequences. State-space models (SSMs) are classical models for time series, and prior works combine SSMs with deep learning layers for efficient sequence modeling. However, we find fundamental limitations with these prior approaches, proving their SSM representations cannot express autoregressive time series processes. We thus introduce SpaceTime, a new state-space time series architecture that improves all three criteria. For expressivity, we propose a new SSM parameterization based on the companion matrix -- a canonical representation for discrete-time processes -- which enables SpaceTime's SSM layers to learn desirable autoregressive processes. For long horizon forecasting, we introduce a "closed-loop" variation of the companion SSM, which enables SpaceTime to predict many future time-steps by generating its own layer-wise inputs. For efficient training and inference, we introduce an algorithm that reduces the memory and compute of a forward pass with the companion matrix. With sequence length $\ell$ and state-space size $d$, we go from $\tilde{O}(d \ell)$ naïvely to $\tilde{O}(d + \ell)$. In experiments, our contributions lead to state-of-the-art results on extensive and diverse benchmarks, with best or second-best AUROC on 6 / 7 ECG and speech time series classification, and best MSE on 14 / 16 Informer forecasting tasks. Furthermore, we find SpaceTime (1) fits AR($p$) processes that prior deep SSMs fail on, (2) forecasts notably more accurately on longer horizons than prior state-of-the-art, and (3) speeds up training on real-world ETTh1 data by 73% and 80% relative wall-clock time over Transformers and LSTMs.
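
The "closed-loop" idea in the abstract can be illustrated with a minimal NumPy sketch. It assumes a single SSM of the form $x_{k+1} = A x_k + B u_k$ whose next input is predicted as $u_{k+1} = K x_{k+1}$, so that forecasting iterates $x_{k+1} = (A + BK) x_k$ without further observations. The function names, the choice of a pure shift matrix for `A`, and setting `K` to known AR coefficients are illustrative assumptions, not the paper's trained layer.

```python
import numpy as np

def shift_matrix(d):
    """d x d shift matrix (a companion matrix whose last column is zero)."""
    S = np.zeros((d, d))
    S[1:, :-1] = np.eye(d - 1)
    return S

def closed_loop_forecast(A, B, K, history, horizon):
    """Warm up on observed inputs, then feed predictions back as inputs."""
    x = np.zeros(A.shape[0])
    for u_k in history:               # open loop: consume observed values
        x = A @ x + B * u_k
    A_cl = A + np.outer(B, K)         # closed-loop state matrix A + B K
    preds = np.empty(horizon)
    for t in range(horizon):
        preds[t] = K @ x              # predicted next input
        x = A_cl @ x                  # equals A x + B (K x)
    return preds

# Illustrative (assumed) AR(3) coefficients; K is set to them by hand here.
phi = np.array([0.6, -0.2, 0.1])
p = len(phi)
A = shift_matrix(p)
B = np.zeros(p)
B[0] = 1.0
K = phi

# Build a noiseless AR(3) series to compare against.
series = [1.0, 0.5, -0.3]
for _ in range(20):
    series.append(phi @ np.array(series[-1:-p - 1:-1]))
series = np.array(series)

history, future = series[:10], series[10:]
preds = closed_loop_forecast(A, B, K, history, horizon=len(future))
print(np.allclose(preds, future))
```

In this toy setting the rollout reproduces the noiseless series because the state holds the last $p$ values and `K` matches the generating coefficients; a learned model would only approximate this.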

Paper Structure

This paper contains 46 sections, 5 theorems, 43 equations, 8 figures, 19 tables, 1 algorithm.

Key Result

Proposition 1

A companion state matrix SSM can represent ARIMA [box1970time], exponential smoothing [winters1960forecasting, holt2004forecasting], and controllable linear time-invariant systems [chen1984linear].
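
As a concrete instance of Proposition 1, the sketch below encodes a noiseless AR($p$) predictor $y_k = \sum_{i=1}^{p} \phi_i u_{k-i}$ as a companion SSM. It assumes the convention $x_{k+1} = A x_k + B u_k$, $y_k = C x_k$, with `A` a shift matrix (a companion matrix with zero last column), `B` the first basis vector, and `C` holding the AR coefficients. The helper names and the coefficient values are hypothetical and chosen only for illustration.

```python
import numpy as np

def companion_matrix(a):
    """d x d companion matrix: ones on the subdiagonal, `a` in the last column."""
    d = len(a)
    A = np.zeros((d, d))
    A[1:, :-1] = np.eye(d - 1)
    A[:, -1] = a
    return A

def ar_as_companion_ssm(phi):
    """Return (A, B, C) so that y_k = sum_i phi_i * u_{k-i}."""
    p = len(phi)
    A = companion_matrix(np.zeros(p))   # pure shift: state stores the last p inputs
    B = np.zeros(p)
    B[0] = 1.0                          # newest input enters the first state slot
    C = np.asarray(phi, dtype=float)    # output weights are the AR coefficients
    return A, B, C

def run_ssm(A, B, C, u):
    """Run x_{k+1} = A x_k + B u_k, y_k = C x_k from a zero initial state."""
    x = np.zeros(A.shape[0])
    ys = np.empty(len(u))
    for k, u_k in enumerate(u):
        ys[k] = C @ x
        x = A @ x + B * u_k
    return ys

phi = [0.5, -0.3, 0.2]                  # assumed example coefficients
rng = np.random.default_rng(0)
u = rng.standard_normal(50)
y = run_ssm(*ar_as_companion_ssm(phi), u)

# Direct evaluation of the AR(p) sum for comparison.
expected = np.zeros_like(u)
for k in range(len(u)):
    for i, phi_i in enumerate(phi, start=1):
        if k - i >= 0:
            expected[k] += phi_i * u[k - i]

print(np.allclose(y, expected))
```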

Figures (8)

  • Figure 1: We learn time series processes as state-space models (SSMs) (top left). We represent SSMs with the companion matrix, which is a highly expressive representation for discrete time series (top middle), and compute such SSMs efficiently as convolutions or recurrences via a shift + low-rank decomposition (top right). We use these SSMs to build SpaceTime, a new time series architecture broadly effective across tasks and domains (bottom).
  • Figure 2: SpaceTime architecture and components. (Left): Each SpaceTime layer carries weights that model multiple companion SSMs, followed optionally by a nonlinear FFN. The SSMs are learned in parallel (1) and computed as a single matrix multiplication (2). (Right): We stack these layers into a SpaceTime network, where earlier layers compute SSMs as convolutions for fast sequence-to-sequence modeling and data preprocessing, while a decoder layer computes SSMs as recurrences for dynamic forecasting.
  • Figure 3: AR($p$) expressiveness benchmarks. SpaceTime captures AR($p$) processes more precisely than similar deep SSM models such as S4 [gu2021efficiently] and S4D [gu2022parameterization], forecasting future samples and learning ground-truth transfer functions more accurately.
  • Figure 4: Longer horizon forecasting on Informer ETTh data. Standardized MSE reported. SpaceTime obtains lower MSE when forecasting longer horizons.
  • Figure 5: Train wall-clock time. Seconds per epoch when training on ETTh1 data.
  • ...and 3 more figures
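
The captions for Figures 1 and 2 mention a shift + low-rank decomposition and computing SSMs either as convolutions or as recurrences. The NumPy sketch below illustrates both points under stated assumptions: it checks that a companion matrix equals a shift matrix plus a rank-one term, and that the recurrent output matches an FFT-based convolution with the kernel $(0, CB, CAB, CA^2B, \ldots)$. The kernel is built here by naive repeated matrix-vector products, so this is not the paper's efficient algorithm; all names and parameter values are illustrative.

```python
import numpy as np

def companion_matrix(a):
    """d x d companion matrix: ones on the subdiagonal, `a` in the last column."""
    d = len(a)
    A = np.zeros((d, d))
    A[1:, :-1] = np.eye(d - 1)
    A[:, -1] = a
    return A

def ssm_recurrence(A, B, C, u):
    """y_k = C x_k with x_{k+1} = A x_k + B u_k and x_0 = 0."""
    x = np.zeros(A.shape[0])
    y = np.empty(len(u))
    for k, u_k in enumerate(u):
        y[k] = C @ x
        x = A @ x + B * u_k
    return y

def ssm_kernel(A, B, C, length):
    """Kernel h with h[0] = 0 and h[m] = C A^(m-1) B (naive construction)."""
    h = np.zeros(length)
    v = B.copy()
    for m in range(1, length):
        h[m] = C @ v
        v = A @ v
    return h

def ssm_convolution(A, B, C, u):
    """Same output as the recurrence, computed as a zero-padded FFT convolution."""
    L = len(u)
    h = ssm_kernel(A, B, C, L)
    n = 2 * L                           # padding avoids circular wrap-around
    y = np.fft.irfft(np.fft.rfft(h, n) * np.fft.rfft(u, n), n)
    return y[:L]

rng = np.random.default_rng(0)
d, L = 4, 64
a = rng.uniform(-0.2, 0.2, size=d)      # small entries keep the system stable
A = companion_matrix(a)
B = rng.standard_normal(d)
C = rng.standard_normal(d)
u = rng.standard_normal(L)

# Shift + low-rank view: A = S + a e_d^T.
S = companion_matrix(np.zeros(d))
e_d = np.zeros(d)
e_d[-1] = 1.0
shift_plus_low_rank = S + np.outer(a, e_d)

y_rec = ssm_recurrence(A, B, C, u)
y_conv = ssm_convolution(A, B, C, u)
print(np.allclose(A, shift_plus_low_rank), np.allclose(y_rec, y_conv))
```

The convolutional form processes a whole sequence at once, which suits training on known inputs, while the recurrent form steps one input at a time, which suits generating forecasts.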

Theorems & Definitions (10)

  • Proposition 1
  • Proposition 2
  • proof: Proof of Proposition 1
  • proof: Proof of Proposition 2
  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Proposition 3: Bounded SpaceTime Gradients
  • proof