Temporal Convolution Derived Multi-Layered Reservoir Computing

Johannes Viehweg; Dominik Walther; Patrick Mäder

TL;DR

This work tackles time-series prediction, especially of chaotic and history-dependent dynamics, where traditional Reservoir Computing (RC) suffers from randomness and high variance across initializations. It introduces Temporal Convolution Derived Reservoir Computing (TCRC) and the TCRC-ELM variant, which embed temporal-convolution-inspired state mappings into reservoir-like architectures to improve memory handling while reducing the dependence on randomness. On Mackey-Glass variants and the SantaFe Laser dataset, the proposed methods achieve substantial reductions in prediction error compared to established baselines such as ESN, GRU, NG-RC, and AEESN. The multi-layer TCRC offers notable gains for non-chaotic data, while TCRC-ELM improves chaotic predictions through an additional random projection. The results point to improved parallelizability, memory retention, and robustness. Limitations concern the scope of the datasets and parameter sensitivity, motivating future work on scaling and non-random mappings.

Abstract

The prediction of time series is a challenging task relevant to applications as diverse as analyzing financial data, forecasting flow dynamics, and understanding biological processes. Chaotic time series that depend on a long history pose an especially difficult problem. While machine learning has proven to be a promising approach for predicting such time series, it either demands long training times and large amounts of training data when using deep Recurrent Neural Networks or, when using a Reservoir Computing approach, comes with high uncertainty and typically requires a large number of random initializations and extensive hyper-parameter tuning. In this paper, we focus on the Reservoir Computing approach and propose a new mapping of input data into the reservoir's state space. Furthermore, we incorporate this method into two novel network architectures that increase the parallelizability, depth, and predictive capabilities of the neural network while reducing the dependence on randomness. For the evaluation, we approximate a set of time series from the Mackey-Glass equation, exhibiting non-chaotic as well as chaotic behavior, and the SantaFe Laser dataset, and compare the predictive capabilities of our approaches to Echo State Networks, Autoencoder connected Echo State Networks, and Gated Recurrent Units. For the chaotic time series, we observe an error reduction of up to $85.45\%$ compared to Echo State Networks and $90.72\%$ compared to Gated Recurrent Units. We also observe improvements of up to $99.99\%$ over the existing approaches for non-chaotic time series.
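To make the benchmark concrete, the following is a minimal sketch of how a Mackey-Glass series could be generated. It is not the authors' data pipeline: the function name `mackey_glass`, the forward-Euler integrator, the step size, the constant initial history, and the coefficients $\beta=0.2$, $\gamma=0.1$, $n=10$ are assumptions (the latter being the values commonly used in the literature). Only the delay $\tau$ appears in this document, for example $\tau=20$ in Figure 6.

```python
def mackey_glass(n_steps, tau=17.0, beta=0.2, gamma=0.1, n=10, dt=0.1, x0=1.2):
    """Integrate dx/dt = beta*x(t-tau)/(1 + x(t-tau)**n) - gamma*x(t)
    with a simple forward-Euler scheme (an assumption of this sketch)."""
    delay = int(round(tau / dt))      # delay expressed in integration steps
    history = [x0] * (delay + 1)      # assumed constant history on [-tau, 0]
    series = []
    for _ in range(n_steps):
        x_t = history[-1]             # current value x(t)
        x_delayed = history[-1 - delay]  # delayed value x(t - tau)
        dx = beta * x_delayed / (1.0 + x_delayed ** n) - gamma * x_t
        history.append(x_t + dt * dx)
        series.append(history[-1])
    return series


# Hypothetical usage: a series with the delay mentioned in Figure 6.
series = mackey_glass(500, tau=20.0)
```

With these commonly used coefficients, the system is usually reported to become chaotic for delays of roughly 17 and above, so varying `tau` is one way to obtain both non-chaotic and chaotic variants.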

Paper Structure (29 sections, 15 equations, 11 figures, 5 tables, 1 algorithm)

Introduction
Reservoir Computing
Echo State Networks
Extreme Learning Machines
Next Generation Reservoir Computing
Adapting Temporal Convolutions for Reservoir Computing
The Temporal Convolution Concept
Temporal Convolution Derived Reservoir Computing (TCRC)
TC-Extreme Layer (TCRC-ELM)
Evaluation and Results
Dataset
Training and Evaluation
Parametrization
Baselines
Echo State Network (ESN)
...and 14 more sections

Figures (11)

Figure 1: Basic architecture of the ESN, where green arrows ($\rightarrow$) refer to the randomly initialized set of connections from the input $x^{(t)}$ into the reservoir and red arrows ($\rightarrow$) refer to the trained mapping from the reservoir to the output $\hat{y}^{(t)}$.
Figure 2: Single block of a TCN $\chi_i$, where red arrow-style edges ($\rightarrow$) refer to the information flow from the input through multiple layers of convolutions to each subsequent layer as well as from the last layer $L$ to the output $\hat{y}^{(t)}$.
Figure 3: Proposed TCRC architecture, where black arrows ($\rightarrow$) refer to the multiplied tokens and red arrows ($\rightarrow$) refer to the learned mapping from the state space of each layer ${}_ls^{(t)}$ to the output $\hat{y}^{(t)}$.
Figure 4: Proposed TCRC-ELM architecture, where black arrows ($\rightarrow$) refer to the multiplied tokens, green arrows ($\rightarrow$) refer to the randomly drawn weights of $W^{\mathrm{in}}$, and red arrows ($\rightarrow$) refer to the learned mapping from the state space $\hat{s}^{(t)}$ to the output $\hat{y}^{(t)}$. For the sake of simplicity we have not shown the optional learned connections from all ${}_j{s}^{(t)}$ to $\hat{y}^{(t)}$.
Figure 6: Example prediction and ground truth for $\tau=20$ using the TCRC.
...and 6 more figures
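
The ESN layout described in the Figure 1 caption (a fixed random mapping into the reservoir, with only the readout trained) can be illustrated with a small pure-Python sketch. This is a generic textbook-style ESN, not the baseline implementation used in the paper. All names (`run_reservoir`, `fit_readout`, `solve_linear`), the reservoir size, the crude weight scaling, the ridge value, the washout length, and the sine-wave toy task are assumptions made for illustration.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def run_reservoir(inputs, w_in, w_res):
    """Drive the fixed random reservoir with a scalar input sequence."""
    n = len(w_in)
    state = [0.0] * n
    states = []
    for x in inputs:
        state = [
            math.tanh(w_in[i] * x + sum(w_res[i][j] * state[j] for j in range(n)))
            for i in range(n)
        ]
        states.append(state)
    return states


def fit_readout(states, targets, ridge=1e-4):
    """Train only the linear readout via ridge regression (normal equations)."""
    n = len(states[0])
    gram = [
        [sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    rhs = [sum(s[i] * y for s, y in zip(states, targets)) for i in range(n)]
    return solve_linear(gram, rhs)


# Toy task (assumption): one-step-ahead prediction of a sine wave.
rng = random.Random(0)
n_res = 20
w_in = [rng.uniform(-1, 1) for _ in range(n_res)]
# Crude scaling that keeps the recurrent weights small; a stand-in for
# the usual spectral-radius normalization, not an exact computation.
scale = 0.9 / math.sqrt(n_res / 3.0)
w_res = [[rng.uniform(-1, 1) * scale for _ in range(n_res)] for _ in range(n_res)]

signal = [math.sin(0.2 * t) for t in range(401)]
states = run_reservoir(signal[:-1], w_in, w_res)
washout = 100  # discard the initial transient before fitting
w_out = fit_readout(states[washout:], signal[washout + 1:])
pred = [sum(w * s for w, s in zip(w_out, st)) for st in states[washout:]]
mse = sum((p - y) ** 2 for p, y in zip(pred, signal[washout + 1:])) / len(pred)
```

The random draws of `w_in` and `w_res` are the source of the initialization variance mentioned in the abstract: changing the seed changes the reservoir, and with it the fitted readout and the resulting error.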
