Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models

Ali Behrouz, Michele Santacatterina, Ramin Zabih

TL;DR

Chimera tackles multivariate time series by introducing a three-headed 2D state-space framework that models dependencies along both time and variate dimensions. It combines two input-dependent 2D SSM heads to capture long-term progression and seasonal patterns, and uses a fast 2D prefix-sum based scan to enable efficient training. The architecture includes data-dependent gating, separate seasonal and trend modules, and a closed-loop decoder to extend forecasting horizons, with theoretical results showing it can recover classical methods and express full-rank kernels with few parameters. Empirically, Chimera delivers state-of-the-art or competitive performance across classification, forecasting, and anomaly detection benchmarks while offering favorable training efficiency and memory usage, and case studies on brain activity underscore the importance of adaptive variate dependencies. The work paves a path for applying 2D SSMs to broader high-dimensional data, including potential extensions to images, videos, and multi-channel signals.
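The prefix-sum idea mentioned above can be illustrated in one dimension. The sketch below is an assumption-laden simplification, not the paper's 2D algorithm: it takes a scalar input-dependent recurrence h[t] = a[t] * h[t-1] + b[t] and evaluates it with a recursive-doubling prefix scan over an associative operator. The function names (`sequential_scan`, `combine`, `prefix_scan`) are hypothetical and chosen only for this example.

```python
import numpy as np

def sequential_scan(a, b):
    """Reference loop: h[t] = a[t] * h[t-1] + b[t], with h[-1] = 0."""
    h = np.zeros(len(b), dtype=float)
    prev = 0.0
    for t in range(len(b)):
        prev = a[t] * prev + b[t]
        h[t] = prev
    return h

def combine(left, right):
    """Compose two affine maps h -> a*h + b (apply `left` first, then `right`).

    The composition is associative, which is what makes a prefix scan valid.
    """
    a_l, b_l = left
    a_r, b_r = right
    return a_l * a_r, a_r * b_l + b_r

def prefix_scan(a, b):
    """Inclusive scan by recursive doubling: about log2(T) vectorized rounds."""
    a = np.asarray(a, dtype=float).copy()
    b = np.asarray(b, dtype=float).copy()
    T = len(b)
    shift = 1
    while shift < T:
        # Element t absorbs the partial result that ends at t - shift.
        a_new, b_new = combine((a[:-shift], b[:-shift]), (a[shift:], b[shift:]))
        a[shift:], b[shift:] = a_new, b_new
        shift *= 2
    # With h[-1] = 0, the offset of the composed map is the hidden state.
    return b

rng = np.random.default_rng(0)
a = rng.uniform(0.1, 0.9, size=13)   # input-dependent decay factors
b = rng.normal(size=13)              # input-dependent drive terms
assert np.allclose(prefix_scan(a, b), sequential_scan(a, b))
```

Each round of the `while` loop is a vectorized operation over the whole sequence, so the dependency chain is logarithmic rather than linear in the sequence length. The 2D scan described in the paper needs an operator that also carries state across variates, which this sketch does not attempt.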

Abstract

Modeling multivariate time series is a well-established problem with a wide range of applications from healthcare to financial markets. Traditional State Space Models (SSMs) are classical approaches for univariate time series modeling due to their simplicity and expressive power to represent linear dependencies. They, however, have fundamentally limited expressive power to capture non-linear dependencies, are slow in practice, and fail to model the inter-variate information flow. Despite recent attempts to improve the expressive power of SSMs by using deep structured SSMs, existing methods are limited to univariate time series, fail to model complex patterns (e.g., seasonal patterns), fail to dynamically model the dependencies of variate and time dimensions, and/or are input-independent. We present Chimera, which uses two input-dependent 2D SSM heads with different discretization processes to learn long-term progression and seasonal patterns. To improve the efficiency of the complex 2D recurrence, we present a fast training procedure that uses a new 2-dimensional parallel selective scan. We further present and discuss 2-dimensional Mamba and Mamba-2 as special cases of our 2D SSM. Our experimental evaluation shows the superior performance of Chimera on extensive and diverse benchmarks, including ECG and speech time series classification, long-term and short-term time series forecasting, and time series anomaly detection.
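To make the role of the discretization step concrete, here is a minimal scalar sketch. It assumes zero-order-hold (ZOH) discretization, which this summary does not specify, and the names `zoh_discretize` and `rollout` are hypothetical. It shows that, under a held input, one step with step size k * delta reproduces k steps with step size delta, so a larger step makes the discrete system evolve faster per sample.

```python
import numpy as np

def zoh_discretize(a, b, delta):
    """ZOH discretization of the scalar system dx/dt = a*x + b*u (a != 0)."""
    a_bar = np.exp(delta * a)
    b_bar = (a_bar - 1.0) / a * b
    return a_bar, b_bar

def rollout(a_bar, b_bar, x0, u, steps):
    """Iterate x <- a_bar * x + b_bar * u with a constant input u."""
    x = x0
    for _ in range(steps):
        x = a_bar * x + b_bar * u
    return x

a, b = -0.5, 1.0          # assumed continuous-time parameters
delta, k = 0.1, 4         # base step size and speed-up factor
x0, u = 1.0, 2.0          # initial state and held input

fine = zoh_discretize(a, b, delta)        # small step: slow evolution per sample
coarse = zoh_discretize(a, b, k * delta)  # large step: k times faster per sample

x_fine = rollout(*fine, x0, u, steps=k)
x_coarse = rollout(*coarse, x0, u, steps=1)
assert abs(x_fine - x_coarse) < 1e-12
```

Under these assumptions, a head with a small learned step retains its state over many samples, while a head with a larger step reacts on a shorter time scale. How Chimera parameterizes the two steps for trend and seasonal heads is defined in the paper, not here.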

Paper Structure (31 sections, 6 theorems, 39 equations, 6 figures, 11 tables)

This paper contains 31 sections, 6 theorems, 39 equations, 6 figures, 11 tables.

Key Result

Proposition 3.1

The 2D discrete SSM introduced in Equations (ssm-main1)-(ssm-main3) with parameters $(\{\bar{\mathbf{A}}_i\}, \{\bar{\mathbf{B}}_i\}, \{\bar{\mathbf{C}}_i\}, k \Delta_1, \ell \Delta_2)$ evolves at a rate $k$ (resp. $\ell$) times as fast as the 2D discrete SSM with parameters $(\{\bar{\mathbf{A}}_i\}, \ldots$

Figures (6)

  • Figure 1: Overview of the contributions and architecture of Chimera. We present a 2-dimensional SSM with careful and expressive parameterization. It uses different learnable discretization processes to learn seasonal and long-term progression patterns, and leverages a fast, parallelizable training process by reformulating the 2D input-dependent recurrence as a 2D prefix sum problem.
  • Figure 2: Different forms of Chimera. (Top-Left) Chimera has a recurrent form (bi-directional along the variates), which can also be computed as a global convolution in training. (Top-Right) In forecasting, we present the multivariate closed loop to improve performance for long horizons. (Bottom) Using data-dependent parameters, Chimera training can be done as a parallel 2D scan.
  • Figure 3: Classification and anomaly detection performance. The full list with additional baselines is in the appendix (app:experiments).
  • Figure 4: Wall-clock scaling.
  • Figure 5: Selection results in generalization to unseen variates.
  • ...and 1 more figure

Theorems & Definitions (7)

  • Proposition 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 3.4
  • Theorem 3.5
  • Theorem 3.6
  • Definition D.1: Companion Matrix