StoxLSTM: A Stochastic Extended Long Short-Term Memory Network for Time Series Forecasting

Zihao Wang, Yunjie Li, Lingmin Zan, Zheng Gong, Mengtao Zhu

TL;DR

StoxLSTM extends xLSTM with stochastic latent dynamics inside a non-autoregressive state-space framework to model deep temporal dependencies and uncertainty in time series. It combines a generative model and a bidirectional inference model trained via a variational ELBO to learn latent trajectories that improve forecasting accuracy. Across nine long-term and four short-term benchmarks, StoxLSTM achieves state-of-the-art or near-state-of-the-art performance, showing robust gains over Transformer-, Linear-, and xLSTM-based baselines and demonstrating the value of explicit latent dynamics and SSM structure. Limitations include Gaussian latent assumptions and training overhead, with future work aimed at relaxing Gaussianity and improving efficiency while maintaining accuracy.

Abstract

The Extended Long Short-Term Memory (xLSTM) network has demonstrated strong capability in modeling complex long-term dependencies in time series data. Despite its success, the deterministic architecture of xLSTM limits its representational capacity and forecasting performance, especially on challenging real-world time series datasets characterized by inherent uncertainty, stochasticity, and complex hierarchical latent dynamics. In this work, we propose StoxLSTM, a stochastic xLSTM within a designed state space modeling framework, which integrates latent stochastic variables directly into the recurrent units to effectively model deep latent temporal dynamics and uncertainty. The designed state space model follows an efficient non-autoregressive generative approach, achieving strong predictive performance without complex modifications to the original xLSTM architecture. Extensive experiments on publicly available benchmark datasets demonstrate that StoxLSTM consistently outperforms state-of-the-art baselines, achieving superior performance and generalization.
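The TL;DR mentions training via a variational ELBO with Gaussian latent variables. As a rough illustration of the two generic ingredients such an objective usually needs, the sketch below shows a reparameterized Gaussian sample and the closed-form KL divergence between diagonal Gaussians, written in plain Python. The function names and the per-step dictionary layout are hypothetical and are not taken from the paper or its code; the networks that would produce the means and log-variances are omitted entirely.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps element-wise, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def gaussian_kl(mu_q, log_var_q, mu_p, log_var_p):
    """KL(q || p) between diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, log_var_q, mu_p, log_var_p):
        total += 0.5 * (lp - lq + math.exp(lq - lp)
                        + (mq - mp) ** 2 / math.exp(lp) - 1.0)
    return total


def gaussian_log_lik(x, mean, log_var):
    """log N(x; mean, exp(log_var)), summed over dimensions."""
    return sum(-0.5 * (math.log(2.0 * math.pi) + lv
                       + (xi - m) ** 2 / math.exp(lv))
               for xi, m, lv in zip(x, mean, log_var))


def elbo_estimate(steps):
    """Single-sample ELBO estimate: sum over time steps of log-lik minus KL.

    Each step is a dict (hypothetical layout) holding the observation,
    the decoder's Gaussian parameters, and the posterior and prior
    Gaussian parameters for the latent at that step.
    """
    total = 0.0
    for s in steps:
        log_lik = gaussian_log_lik(s["x"], s["x_mean"], s["x_log_var"])
        kl = gaussian_kl(s["mu_q"], s["log_var_q"], s["mu_p"], s["log_var_p"])
        total += log_lik - kl
    return total
```

In a real model the sample returned by `reparameterize` would feed the decoder that produces `x_mean` and `x_log_var`, so that gradients can flow through the sampling step; here the values are passed in directly to keep the sketch self-contained.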

Paper Structure

This paper contains 27 sections, 17 equations, 14 figures, 4 tables.

Figures (14)

  • Figure 1: The recurrent unit of StoxLSTM. StoxLSTM adds stochastic latent variables to the original xLSTM to represent more complex hierarchical and stochastic characteristics in time series data. The xLSTM consists of two sub-blocks: sLSTM and mLSTM. (a) shows the stochasticized sLSTM block, referred to as StosLSTM, and (b) shows the stochasticized mLSTM block, referred to as StomLSTM. Diamonds denote deterministic variables, such as $\bm{n}_t$, $\bm{c}_t$, and $\bm{h}_t$ from the original xLSTM. Circles denote stochastic variables: the latent variable $\bm{z}_t$, obtained through the reparameterization module, and the observed time series variable $x_t$.
  • Figure 2: State space models corresponding to the generative model and the inference model in StoxLSTM. (a) and (b) illustrate the reconstruction and forecasting phases of the generative model, respectively. In (a), the observation $x_t$ follows the conditional distribution $p_{\bm\theta}(x_t \mid \bm{z}_t, x_{1:t-1})$, while the latent state $\bm{z}_t$ evolves according to $p_{\bm\theta}(\bm{z}_t\mid\bm{z}_{t-1}, x_{1:t-1})$. In (b), the forecasting phase, $x_t$ is generated from $p_{\bm\theta}(x_t\mid \bm{z}_t, x_{1:L})$, and the latent state $\bm{z}_t$ transitions according to $p_{\bm\theta}(\bm{z}_t \mid \bm{z}_{t-1}, x_{1:L})$. (c) illustrates the inference model, in which the latent state $\bm{z}_t$ is inferred from the approximate posterior distribution $q_\phi(\bm{z}_t \mid \bm{z}_{t-1}, x_{1:L+T})$.
  • Figure 3: Overall framework of StoxLSTM. (a) depicts the generative model, while (b) illustrates the inference model. In both (a) and (b), the dashed boxes represent the stacked recurrent units of StoxLSTM shown in (c). Each stacked unit in (c) corresponds to either a StomLSTM or a StosLSTM block.
  • Figure 4: Predictions from StoxLSTM on the Traffic dataset for prediction horizons $H \in \{48, 96, 192, 336\}$. The orange line shows the values predicted by StoxLSTM, and the blue line shows the ground truth.
  • Figure 5: Predictions from StoxLSTM on the Electricity dataset for prediction horizons $H \in \{96, 192, 336, 720\}$. The orange line shows the values predicted by StoxLSTM, and the blue line shows the ground truth.
  • ...and 9 more figures
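
The distributions named in the Figure 2 caption suggest the general shape of a variational objective for the forecasting phase. As a hedged sketch only, assuming the forecast window is indexed $t = L+1, \dots, L+T$ (an assumption read off the caption's $x_{1:L+T}$) and using the standard sequential ELBO form rather than the paper's exact loss, one could write:

$$
\mathcal{L}(\bm\theta, \phi) = \sum_{t=L+1}^{L+T} \mathbb{E}_{q_\phi}\Big[ \log p_{\bm\theta}(x_t \mid \bm{z}_t, x_{1:L}) - \mathrm{KL}\big( q_\phi(\bm{z}_t \mid \bm{z}_{t-1}, x_{1:L+T}) \,\big\|\, p_{\bm\theta}(\bm{z}_t \mid \bm{z}_{t-1}, x_{1:L}) \big) \Big]
$$

The first term rewards accurate generation of $x_t$ from the latent state, and the KL term keeps the approximate posterior, which sees the full window $x_{1:L+T}$, close to the prior transition, which sees only the history $x_{1:L}$. The symbol $\mathcal{L}$ and the summation range are introduced here for illustration; the objective actually used in the paper may include additional terms or weights.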