StoxLSTM: A Stochastic Extended Long Short-Term Memory Network for Time Series Forecasting
Zihao Wang, Yunjie Li, Lingmin Zan, Zheng Gong, Mengtao Zhu
TL;DR
StoxLSTM extends xLSTM with stochastic latent dynamics inside a non-autoregressive state-space framework to model deep temporal dependencies and uncertainty in time series. It pairs a generative model with a bidirectional inference model, both trained by maximizing a variational evidence lower bound (ELBO), to learn latent trajectories that improve forecasting accuracy. Across nine long-term and four short-term benchmarks, StoxLSTM achieves state-of-the-art or near-state-of-the-art performance, with robust gains over Transformer-, linear-, and xLSTM-based baselines that demonstrate the value of explicit latent dynamics and state-space model (SSM) structure. Limitations include the Gaussian assumption on the latent variables and added training overhead; future work aims to relax Gaussianity and improve efficiency while maintaining accuracy.
Abstract
The Extended Long Short-Term Memory (xLSTM) network has demonstrated strong capability in modeling complex long-term dependencies in time series data. However, its deterministic architecture limits its representational capacity and forecasting performance, especially on challenging real-world time series characterized by inherent uncertainty, stochasticity, and complex hierarchical latent dynamics. In this work, we propose StoxLSTM, a stochastic xLSTM built within a purpose-designed state-space modeling framework that integrates stochastic latent variables directly into the recurrent units to model deep latent temporal dynamics and uncertainty. The state-space model follows an efficient non-autoregressive generative approach and achieves strong predictive performance without complex modifications to the original xLSTM architecture. Extensive experiments on publicly available benchmark datasets demonstrate that StoxLSTM consistently outperforms state-of-the-art baselines in both forecasting performance and generalization.
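
To make the variational training named in the TL;DR more concrete, the following is a minimal, generic sketch of a per-sequence ELBO for a latent-variable sequence model with diagonal Gaussian latents. It is not the authors' implementation. The function names (`reparameterize`, `gaussian_kl`, `gaussian_log_likelihood`, `sequence_elbo`) are hypothetical. The networks that would produce the prior, posterior, and reconstruction parameters (xLSTM-based in StoxLSTM) are omitted and replaced by plain lists of numbers. The Gaussian likelihood is also an assumption made for illustration.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def gaussian_kl(mu_q, log_var_q, mu_p, log_var_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, log_var_q, mu_p, log_var_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def gaussian_log_likelihood(x, mean, log_var):
    """log N(x; mean, diag(exp(log_var))), summed over dimensions."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi) + lv + (xi - m) ** 2 / math.exp(lv))
        for xi, m, lv in zip(x, mean, log_var)
    )


def sequence_elbo(xs, recon, posterior, prior):
    """Sum over time steps of (reconstruction log-likelihood - KL term).

    xs:        list of T observation vectors
    recon:     list of T (mean, log_var) pairs for the likelihood of x_t
    posterior: list of T (mean, log_var) pairs for the inference model q(z_t | ...)
    prior:     list of T (mean, log_var) pairs for the generative prior p(z_t | ...)
    All parameter values are assumed to come from networks not shown here.
    """
    total = 0.0
    for x, (rm, rlv), (mq, lq), (mp, lp) in zip(xs, recon, posterior, prior):
        total += gaussian_log_likelihood(x, rm, rlv) - gaussian_kl(mq, lq, mp, lp)
    return total


# Usage: draw one latent sample and score a single-step toy sequence.
rng = random.Random(0)
z = reparameterize([0.0, 1.0], [0.0, 0.0], rng)
value = sequence_elbo(
    xs=[[0.2]],
    recon=[([0.0], [0.0])],
    posterior=[([0.5], [-1.0])],
    prior=[([0.0], [0.0])],
)
```

In training, one would maximize this quantity (or minimize its negative) with respect to the parameters of both models. The reparameterized sample is what lets gradients flow through the stochastic latent variables in an automatic-differentiation framework.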
