ISMRNN: An Implicitly Segmented RNN Method with Mamba for Long-Term Time Series Forecasting
GaoXiang Zhao, Li Zhou, XiaoQiang Wang
TL;DR
ISMRNN tackles long-term time series forecasting by addressing information loss and gradient issues inherent to RNNs, building on SegRNN with three enhancements: implicit segmentation for denser information exchange, a residual encoding path to bypass part of the recurrent flow, and Mamba-based preprocessing to better extract temporal structure. The method shows superior performance across six real-world datasets, achieving top results on a majority of MSE/MAE metrics and demonstrating notable gains especially at shorter look-back windows. Ablation confirms a synergistic effect among the three components, while efficiency remains competitive with non-RNN baselines. Overall, ISMRNN offers a practical, scalable approach to long-horizon forecasting with improved information flow and robustness to horizon length.
Abstract
Long time series forecasting aims to utilize historical information to forecast future states over extended horizons. Traditional RNN-based series forecasting methods struggle to effectively address long-term dependencies and gradient issues in long time series problems. Recently, SegRNN has emerged as a leading RNN-based model tailored for long-term series forecasting, demonstrating state-of-the-art performance while maintaining a streamlined architecture through innovative segmentation and parallel decoding techniques. Nevertheless, SegRNN has several limitations: its fixed segmentation disrupts data continuity and fails to effectively leverage information across different segments, the segmentation strategy employed by SegRNN does not fundamentally address the issue of information loss within the recurrent structure. To address these issues, we propose the ISMRNN method with three key enhancements: we introduce an implicit segmentation structure to decompose the time series and map it to segmented hidden states, resulting in denser information exchange during the segmentation phase. Additionally, we incorporate residual structures in the encoding layer to mitigate information loss within the recurrent structure. To extract information more effectively, we further integrate the Mamba architecture to enhance time series information extraction. Experiments on several real-world long time series forecasting datasets demonstrate that our model surpasses the performance of current state-of-the-art models.
