TimePro: Efficient Multivariate Long-term Time Series Forecasting with Variable- and Time-Aware Hyper-state

Xiaowen Ma, Zhenliang Ni, Shuai Xiao, Xinghao Chen

TL;DR

TimePro tackles the multi-delay challenge in multivariate long-term forecasting by constructing variable- and time-aware hyper-states within a Mamba-based encoder. It combines reversible normalization, time- and variable-preserved patch embeddings, stacked ProBlocks, and a HyperMamba module whose Hyper-S scan adaptively selects time points to tune the state and models variable interactions, with $O(NL)$ complexity. The method delivers state-of-the-art results on eight real-world datasets with lower resource usage, and comprehensive ablations show the value of adaptive time tuning and the HyperMamba design. This work offers a scalable, efficient approach to accurate long-horizon forecasting in high-dimensional time series and sets the stage for a large time series foundation model.
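To make the first two stages named above concrete, here is a minimal sketch of reversible normalization followed by temporal patching for a single variable. It is an illustration under stated assumptions, not the authors' implementation: the function names, `eps`, `patch_len`, and `stride` are hypothetical, and the learnable affine parameters that reversible normalization often carries are omitted.

```python
from statistics import fmean, pstdev


def normalize(window, eps=1e-5):
    """Standardize one variable's lookback window and keep its statistics."""
    mean = fmean(window)
    std = (pstdev(window) ** 2 + eps) ** 0.5  # eps guards against constant windows
    return [(x - mean) / std for x in window], (mean, std)


def denormalize(values, stats):
    """Map values back to the original scale using the stored statistics."""
    mean, std = stats
    return [v * std + mean for v in values]


def patchify(window, patch_len, stride):
    """Split a window into (possibly overlapping) temporal patches."""
    last_start = len(window) - patch_len
    return [window[i:i + patch_len] for i in range(0, last_start + 1, stride)]


# Hypothetical lookback window for one variable.
lookback = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]
normed, stats = normalize(lookback)
patches = patchify(normed, patch_len=4, stride=2)  # 3 patches of length 4
restored = denormalize(normed, stats)              # back on the original scale
```

In a full model the encoder would sit between `normalize` and `denormalize`, so the forecast is produced on the standardized scale and then mapped back with the statistics of the lookback window.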

Abstract

In long-term time series forecasting, different variables often influence the target variable over distinct time intervals, a challenge known as the multi-delay issue. Traditional models typically process all variables or time points uniformly, which limits their ability to capture complex variable relationships and obtain non-trivial time representations. To address this issue, we propose TimePro, an innovative Mamba-based model that constructs variate- and time-aware hyper-states. Unlike conventional approaches that merely transfer plain states across variable or time dimensions, TimePro preserves the fine-grained temporal features of each variate token and adaptively selects the time points to focus on when tuning the plain state. The reconstructed hyper-state can perceive both variable relationships and salient temporal information, which helps the model produce accurate forecasts. In experiments, TimePro performs competitively on eight real-world long-term forecasting benchmarks with satisfactory linear complexity. Code is available at https://github.com/xwmaxwma/TimePro.
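The idea of tuning a plain state with selected time points can be pictured with a toy example. The sketch below is not the TimePro Hyper-S scan: it uses a fixed-decay linear recurrence across variables as a stand-in for the Mamba scan, a magnitude-based top-k rule as a stand-in for the adaptive selection, and a sigmoid gate. All function names, `decay`, and `k` are assumptions made for illustration.

```python
import math


def plain_scan(x, decay=0.5):
    """Toy scan along the variable axis: h[n] = decay * h[n-1] + x[n], per time point."""
    state = [0.0] * len(x[0])
    states = []
    for row in x:
        state = [decay * s + v for s, v in zip(state, row)]
        states.append(state)
    return states


def select_time_points(row, k):
    """Placeholder selector: indices of the k largest-magnitude time points."""
    order = sorted(range(len(row)), key=lambda t: abs(row[t]), reverse=True)
    return sorted(order[:k])


def toy_hyper_states(x, k=2, decay=0.5):
    """Keep the plain state only at each variable's selected time points, then gate it."""
    out = []
    for row, h in zip(x, plain_scan(x, decay)):
        keep = set(select_time_points(row, k))
        tuned = [h_t if t in keep else 0.0 for t, h_t in enumerate(h)]
        gate = [1.0 / (1.0 + math.exp(-v)) for v in row]  # sigmoid of the input
        out.append([g * s for g, s in zip(gate, tuned)])
    return out


# Two variables, three time features each (hypothetical values).
x = [[1.0, -3.0, 0.5],
     [2.0, 0.1, -4.0]]
hyper = toy_hyper_states(x, k=2)
```

Each row of `x` is one variable's embedding. The state passed from the first variable to the second carries cross-variable information, while the per-variable mask keeps the time axis explicit instead of collapsing it.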

Paper Structure

This paper contains 34 sections, 14 equations, 11 figures, 7 tables.

Figures (11)

  • Figure 1: Forecasting performance comparison of TimePro with other state-of-the-art methods. Average results (MSE) are reported following iTransformer. Visualization results show that TimePro outperforms previous methods on the popular multivariate long-term forecasting benchmarks.
  • Figure 2: Efficiency comparison of TimePro with other state-of-the-art methods. We set the lookback window L = 96, forecast horizon H = 720, and batch size to 16 on the Electricity dataset. Training and inference times are measured on an NVIDIA V100 GPU. Compared to other methods, TimePro achieves satisfactory performance with minimal parameters, FLOPs, and memory consumption, along with competitive training and inference speeds.
  • Figure 3: Overview of our TimePro method. The multivariate time series is first embedded along the temporal dimension with the patching operation to get the series representation for each variable. Then the variable correlations and the time representation of each variable are captured by multiple layers of ProBlock modules. The core component of ProBlock is HyperMamba, which adaptively selects important time points to regulate the plain state of the variable dimension. The reconstructed time- and variable-aware hyper-states are then applied to obtain the output.
  • Figure 4: Implementation details of hardware-aware hyper-scan. We exploit the GPU memory hierarchy, i.e., perform plain state acquisition on GPU SRAM (implemented in the grey box above) and all other operations on GPU HBM. Specifically, we follow the original Mamba implementation by first scanning the embedding along the variables to acquire the plain state. Then, we reshape the plain state to recover the fine-grained time dimension of the embedding. Next, we adaptively select important time points for each variable to adjust the plain state and obtain time- and variable-aware hyper-states. Finally, the reconstructed hyper-states are applied through a gating mechanism to obtain the augmented embeddings.
  • Figure 5: Visualization for multivariate correlation analysis on the ETTm1 (top) and ETTh1 (bottom) datasets. The visualization is based on the Pearson Correlation Coefficient. GT Correlations denotes the correlation between the variables of the forecast sequence (ground truth). The two columns on the right denote the correlation between the variables before and after the HyperMamba module, respectively. It shows that the HyperMamba module drives the learned multivariate correlations closer to those of the forecast sequence.
  • ...and 6 more figures
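
The correlation analysis described for Figure 5 rests on pairwise Pearson coefficients between variables. A minimal sketch of that computation is given below. The helper names and the toy series are hypothetical, and the mean-absolute-gap score is only one assumed way to quantify "closer to the ground truth", not a metric taken from the paper.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series.

    Raises ZeroDivisionError if either series is constant.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(series):
    """Pairwise Pearson correlations; series[i] holds the values of variable i."""
    return [[pearson(a, b) for b in series] for a in series]


def mean_abs_gap(m1, m2):
    """Average absolute difference between two correlation matrices (assumed score)."""
    cells = [abs(u - v) for r1, r2 in zip(m1, m2) for u, v in zip(r1, r2)]
    return sum(cells) / len(cells)


# Toy data: three variables observed at four time steps.
ground_truth = [[1.0, 2.0, 3.0, 4.0],
                [2.0, 4.0, 6.0, 8.0],
                [4.0, 3.0, 2.0, 1.0]]
gt_corr = correlation_matrix(ground_truth)
```

Applying `correlation_matrix` to a model's intermediate features before and after a module, and comparing each result against `gt_corr` with `mean_abs_gap`, gives a numeric counterpart to the visual comparison in the figure.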