Parallel BiLSTM-Transformer networks for forecasting chaotic dynamics

Junwen Ma, Mingyu Ge, Yisen Wang, Yong Zhang, Weicheng Fu

TL;DR

This work tackles chaotic time-series forecasting by introducing a parallel BiLSTM-Transformer framework that separately models local temporal dynamics and global dependencies, then fuses their representations for prediction. Using time-delay embeddings on the Lorenz system, the method is evaluated on autonomous evolution and inference of unmeasured variables, demonstrating superior accuracy and stability compared with single-branch baselines. Key contributions include a three-layer Transformer encoder with self-attention, a BiLSTM branch for local features, and a simple feature-fusion scheme (element-wise addition of the two branch outputs followed by a dense layer), all trained with MSE loss and standard optimization. The findings show that the hybrid approach extends predictive horizons in chaotic regimes and enables reliable reconstruction of unobserved states from partial observations, with implications for nonlinear dynamics modelling and for applications in engineering and science.

Abstract

The nonlinear nature of chaotic systems results in extreme sensitivity to initial conditions and highly intricate dynamical behaviors, posing fundamental challenges for accurately predicting their evolution. Conventional approaches fail to capture local features and global dependencies in chaotic time series simultaneously. To overcome this limitation, this study proposes a parallel predictive framework integrating Transformer and Bidirectional Long Short-Term Memory (BiLSTM) networks. The hybrid model employs a dual-branch architecture, where the Transformer branch mainly captures long-range dependencies while the BiLSTM branch focuses on extracting local temporal features. The complementary representations from the two branches are fused in a dedicated feature-fusion layer to enhance predictive accuracy. As illustrative examples, the model's performance is systematically evaluated on two representative tasks in the Lorenz system. The first is autonomous evolution prediction, in which the model recursively extrapolates system trajectories from the time-delay embeddings of the state vector to evaluate long-term tracking accuracy and stability. The second is inference of unmeasured variables, where the model reconstructs the unobserved states from the time-delay embeddings of partial observations to assess its state-completion capability. The results consistently indicate that the proposed hybrid framework outperforms both single-branch architectures across tasks, demonstrating its robustness and effectiveness in chaotic system prediction.
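
The data pipeline behind the autonomous evolution task can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Lorenz parameters (sigma = 10, rho = 28, beta = 8/3), the RK4 integrator, the step size, and the embedding dimension and lag are all assumptions, since none of them is stated in this summary. The names `delay_embed`, `rollout`, and `predict_next` are hypothetical; `predict_next` stands in for a trained network.

```python
from typing import Callable, List, Sequence

State = List[float]


def lorenz_rhs(s: Sequence[float], sigma: float = 10.0,
               rho: float = 28.0, beta: float = 8.0 / 3.0) -> State:
    """Right-hand side of the Lorenz equations (classical parameters assumed)."""
    x, y, z = s
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]


def rk4_step(s: Sequence[float], dt: float) -> State:
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(k: Sequence[float], h: float) -> State:
        return [si + h * ki for si, ki in zip(s, k)]

    k1 = lorenz_rhs(s)
    k2 = lorenz_rhs(shifted(k1, dt / 2))
    k3 = lorenz_rhs(shifted(k2, dt / 2))
    k4 = lorenz_rhs(shifted(k3, dt))
    return [si + dt / 6 * (a + 2 * b + 2 * c + d)
            for si, a, b, c, d in zip(s, k1, k2, k3, k4)]


def simulate(s0: Sequence[float], dt: float, n_steps: int) -> List[State]:
    """Return the trajectory s(0), s(dt), ..., s(n_steps * dt)."""
    traj = [list(s0)]
    for _ in range(n_steps):
        traj.append(rk4_step(traj[-1], dt))
    return traj


def delay_embed(series: Sequence[Sequence[float]], dim: int, lag: int) -> List[State]:
    """Time-delay embedding: each row is [s(t), s(t-lag), ..., s(t-(dim-1)*lag)]."""
    start = (dim - 1) * lag
    return [[v for k in range(dim) for v in series[t - k * lag]]
            for t in range(start, len(series))]


def rollout(predict_next: Callable[[State], State],
            history: Sequence[Sequence[float]],
            dim: int, lag: int, n_steps: int) -> List[State]:
    """Autonomous evolution: every prediction is fed back as a future input."""
    if len(history) < (dim - 1) * lag + 1:
        raise ValueError("history is too short for the requested embedding")
    buf = [list(s) for s in history]
    for _ in range(n_steps):
        # Newest delay vector, built from the most recent (possibly predicted) states.
        window = [v for k in range(dim) for v in buf[-1 - k * lag]]
        buf.append(list(predict_next(window)))
    return buf[len(history):]
```

For the inference task, the same `delay_embed` call would be applied to the observed component alone (for example a series of one-element lists holding $x$), with the unobserved components as regression targets. Passing the true integrator as `predict_next`, e.g. `lambda w: rk4_step(w[:3], dt)`, is a convenient sanity check because the rollout must then reproduce the simulated trajectory.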

Paper Structure

This paper contains 10 sections, 21 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: Structure of the proposed dual-branch model, comprising a BiLSTM branch and a Transformer branch. The outputs of the two branches are merged through element-wise addition, and the fused representation is then passed to a dense layer for the final prediction.
  • Figure 2: The prediction results of different models on the autonomous evolution of the Lorenz system. (a)-(c) correspond respectively to the results of state variables $x$, $y$, and $z$. (d) Evolution of NRMSE of the three models. The intersection point of the horizontal solid line and the curve (indicated by the cross) determines the VPT. All panels share the legend.
  • Figure 3: The violin plot shows the distribution of $t_{\rm VPT}$ given by different models under 100 random initializations, where the red dots and blue stars correspond to the median and mean, respectively.
  • Figure 4: Inference results for $y$ (b) and $z$ (c) using $x$ (a) as the observed variable, i.e., $\boldsymbol{m}=x$ and $\boldsymbol{u}=[y,z]$ in Eq. (\ref{eq-infer-E}). (d) The evolution of RMSE for the different models, where the solid line represents the cumulative average. Panels (b)-(d) share the legend.
  • Figure 5: Inference results for $x$ (b) and $z$ (c) using $y$ (a) as the observed variable. (d) The evolution of RMSE and $\overline{\rm RMSE}$.
  • ...and 2 more figures
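
Two details from the captions above can be made concrete: the fusion step of Figure 1 (element-wise addition followed by a dense layer) and the way a valid prediction time (VPT) is read off an error curve in Figure 2. The sketch below is illustrative only. The function names, the plain-list representation of vectors, the linear dense layer without activation, and the threshold value are all assumptions; this summary does not give the paper's NRMSE normalisation, threshold, or layer sizes.

```python
from typing import List, Sequence


def fuse_and_project(h_bilstm: Sequence[float],
                     h_transformer: Sequence[float],
                     weights: Sequence[Sequence[float]],
                     bias: Sequence[float]) -> List[float]:
    """Add the two branch outputs element-wise, then apply a dense (affine) layer."""
    if len(h_bilstm) != len(h_transformer):
        raise ValueError("element-wise addition needs equal-length branch outputs")
    fused = [a + b for a, b in zip(h_bilstm, h_transformer)]
    # One output per weight row: y_i = sum_j W_ij * fused_j + b_i.
    return [sum(w * f for w, f in zip(row, fused)) + b
            for row, b in zip(weights, bias)]


def valid_prediction_time(errors: Sequence[float], dt: float,
                          threshold: float) -> float:
    """Time of the first sample whose error exceeds the threshold.

    If the curve never crosses the threshold, the full horizon is returned,
    which should then be read as a lower bound on the VPT.
    """
    for i, e in enumerate(errors):
        if e > threshold:
            return i * dt
    return len(errors) * dt
```

Element-wise addition requires both branches to emit vectors of the same width, which is why the length check comes first. The VPT here is returned in the same time units as `dt`; whether the paper reports it in those units or in Lyapunov times is not stated in this summary.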