Changes by Butterflies: Farsighted Forecasting with Group Reservoir Transformer

Md Kowsher; Abdul Rafae Khan; Jia Xu

Abstract

In chaos, a minor divergence between two initial conditions is amplified exponentially over time and leads to widely separated outcomes, a phenomenon known as the butterfly effect. The distant future is therefore full of uncertainty and hard to forecast. We introduce the Group Reservoir Transformer to predict long-term events more accurately and robustly by overcoming two challenges in chaos: (1) extensive historical sequences and (2) sensitivity to initial conditions. A reservoir is attached to a Transformer to efficiently handle arbitrarily long historical sequences, and it is extended to a group of reservoirs to reduce the sensitivity to variations in initialization. Our architecture consistently outperforms state-of-the-art models in multivariate time series forecasting, including TimeLLM, GPT2TS, PatchTST, DLinear, TimeNet, and the baseline Transformer, with an error reduction of up to 59\% across various datasets such as ETTh, ETTm, and air quality. This demonstrates that an ensemble of butterfly learning can improve the adequacy and certainty of event prediction, despite the long travel time into the unknown future.
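The sensitivity to initial conditions described above can be illustrated with a small numerical sketch. The code below is not taken from the paper: it integrates the classic Lorenz system with the standard parameters (sigma = 10, rho = 28, beta = 8/3) using a fourth-order Runge-Kutta step, and tracks the distance between two trajectories whose starting points differ by an assumed perturbation of 1e-8. The function names, step size, and horizon are illustrative choices.

```python
# Illustrative sketch (not the authors' code): two Lorenz trajectories that
# start 1e-8 apart drift far from each other, i.e. the butterfly effect.

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations for state (x, y, z)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz(state)
    k2 = lorenz(shift(state, k1, dt / 2))
    k3 = lorenz(shift(state, k2, dt / 2))
    k4 = lorenz(shift(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def distance(a, b):
    """Euclidean distance between two states."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


def separation_over_time(eps=1e-8, dt=0.01, steps=2500):
    """Return the distance between two nearby trajectories after each step."""
    a = (1.0, 1.0, 1.0)
    b = (1.0 + eps, 1.0, 1.0)  # assumed tiny perturbation of the start point
    seps = [distance(a, b)]
    for _ in range(steps):
        a = rk4_step(a, dt)
        b = rk4_step(b, dt)
        seps.append(distance(a, b))
    return seps


seps = separation_over_time()
print(f"initial separation: {seps[0]:.3e}")
print(f"final separation:   {seps[-1]:.3e}")
```

Running the sketch prints the initial and final separations; the final value is many orders of magnitude larger than the initial one, which is the difficulty that long-horizon forecasting has to contend with.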

Paper Structure

This paper contains 26 sections, 11 equations, 9 figures, 13 tables, and 1 algorithm.

Introduction
Background
Method
Cross-attention for Embedding
Deep Reservoir Computing
Linear Readout
Nonlinear Readout
Group Reservoir
Group Reservoir Readout with Self-Attention
Group Reservoir Transformer (RT)
Training
Experiments
Ablation Analysis
Related Work
Conclusion
...and 11 more sections

Figures (9)

Figure 1: Left: Group Reservoir Transformer. Right: On the Lorenz attractor, our RT prediction is closer to the ground truth than the Transformer's.
Figure 3: MSE vs. reservoir numbers.
Figure 4: RT outperforms baselines with small parameter size and loss in training.
Figure 5: Look-back window size vs. loss.
Figure 6: LIME explanation of the effect of the current input (A) and the reservoir (B) on the output prediction. The test datasets are ETTh1, ETTh2, ETTm1, and ETTm2. The results indicate that the current input has a greater impact on feature 6, while the reservoir affects multiple features, such as 20, 13, 29, and 7.
...and 4 more figures
