Efficient Time-Series Approximation with Linear Recurrent Neural Networks: Architecture Learning and Predictive Power

Frieder Stolzenburg, Sandra Litz, Olivia Michael, Oliver Obst

TL;DR

This paper introduces LRNNs, autoregressive recurrent neural networks with linear activation. They can be learned by solving a linear equation system and reduced in size in one step by inspecting the eigenvalue spectrum of the transition matrix. LRNNs can approximate any time-dependent function $f(t)$ and, in the long run, follow ellipse trajectories governed by the dominant eigenvalues, which allows compact models with strong predictive power. Training is fast, needs no backpropagation, and yields an interpretable, low-dimensional dynamical structure suitable for time-series domains such as finance and robotics. Across MSO, RoboCup, and puzzles, LRNNs achieve competitive or superior results with far fewer units, highlighting practical advantages in speed, compression, and interpretability.

Abstract

Recurrent neural networks are a powerful means to cope with time series. We show how autoregressive linear, i.e., linearly activated recurrent neural networks (LRNNs) can approximate any time-dependent function f(t). The approximation can effectively be learned by simply solving a linear equation system; no backpropagation or similar methods are needed. Furthermore, and this is the main contribution of this paper, the size of an LRNN can be reduced significantly in one step after inspecting the spectrum of the network transition matrix, i.e., its eigenvalues, by taking only the most relevant components. Therefore, in contrast to other approaches, we do not only learn network weights but also the network architecture. LRNNs have interesting properties: They end up in ellipse trajectories in the long run and allow the prediction of further values and compact representations of functions. We demonstrate this by several case studies, among them multiple superimposed oscillators (MSO), robotic soccer (RoboCup), and stock price prediction. LRNNs outperform the previous state-of-the-art for the MSO task with a minimal number of units.
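As a rough illustration of the two ideas in the abstract (learning by solving linear equations, then inspecting eigenvalues), the following toy sketch fits a two-lag autoregressive linear model to a sampled sinusoid. It is not the paper's algorithm: the function names, the two-lag model, and the test signal are assumptions chosen only to keep the example small, and the fit uses exactly four samples rather than a general least-squares solve.

```python
import cmath
import math


def fit_two_lag_model(x):
    """Solve x[t+2] = a*x[t+1] + b*x[t] for (a, b) from the first four samples.

    Hypothetical helper: the two equations
        x[2] = a*x[1] + b*x[0]
        x[3] = a*x[2] + b*x[1]
    are solved with Cramer's rule.
    """
    det = x[1] * x[1] - x[0] * x[2]
    if abs(det) < 1e-12:
        raise ValueError("samples do not determine a unique two-lag model")
    a = (x[2] * x[1] - x[0] * x[3]) / det
    b = (x[1] * x[3] - x[2] * x[2]) / det
    return a, b


def transition_eigenvalues(a, b):
    """Eigenvalues of the transition matrix [[a, b], [1, 0]].

    They are the roots of lambda^2 - a*lambda - b = 0.
    """
    disc = cmath.sqrt(a * a + 4 * b)
    return (a + disc) / 2, (a - disc) / 2


def predict(x0, x1, a, b, steps):
    """Iterate the learned linear recurrence from two start values."""
    out = [x0, x1]
    while len(out) < steps:
        out.append(a * out[-1] + b * out[-2])
    return out


omega = 0.3  # assumed angular frequency of the toy signal
samples = [math.sin(omega * t + 0.5) for t in range(4)]

a, b = fit_two_lag_model(samples)
eigs = transition_eigenvalues(a, b)
forecast = predict(samples[0], samples[1], a, b, 50)
max_error = max(abs(forecast[t] - math.sin(omega * t + 0.5)) for t in range(50))

print("a, b =", a, b)
print("eigenvalue moduli =", [abs(lam) for lam in eigs])
print("max forecast error =", max_error)
```

For this signal the learned weights come out close to $a = 2\cos(\omega)$ and $b = -1$, and both eigenvalues lie on the unit circle as a complex conjugate pair, which is the kind of spectrum that produces the ellipse trajectories mentioned above.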

Paper Structure

This paper contains 27 sections, 34 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: General recurrent neural network. In ESNs, only output weights are trained and the hidden layer is also called reservoir.
  • Figure 2: LRNNs for (a) $f(t) = t^2$ and (b + c) the Fibonacci series ($0,1,1,2,3,5,8,\dots$) with time step $\tau=1$. In each case, the input/output neuron $x_1$ is marked by a double circle. The initial values of the neurons at time $t_0=0$ are written in the nodes. The weights are annotated at the edges.
  • Figure 3: Dynamic system behavior of a pure random reservoir with unit spectral radius, with $N^\mathrm{res} = 100$ neurons: (a) Eigenvalue spectrum of the reservoir matrix $W^\mathrm{res}$ with complex conjugate eigenvalue pairs in the complex plane. (b) Visualization of $f(t)$ by planar projection. In the long run, we get an ellipse trajectory, thus only two dimensions (cf. \ref{twodim}). (c) Projected to one (arbitrary) dimension, we have pure sinusoids with one single angular frequency for large $t$, sampled in large steps.
  • Figure 4: Asymptotic behavior of pure random reservoirs with unit spectral radius: The (Euclidean) distance between the actual value $f(t)$ (according to \ref{form}) and its approximation by the final ellipse trajectory (\ref{twodim}) is almost zero already after a few hundred steps. The figure shows the distances for $N^\mathrm{res}=100$ (solid/blue), $N^\mathrm{res}=500$ (dashed/red), and $N^\mathrm{res}=1000$ (dotted/black) random reservoir neurons, starting with a random vector of unit length, averaged over 1000 trials.
  • Figure 5: Pseudocode for learning LRNNs including network size reduction. A binary search algorithm is employed for determining the relevant network components with smallest errors. For this, the network components are sorted by their RMSE. The program returns the number $M$ of relevant components in the Jordan matrix $J$ (line 24). The subroutine $\mathrm{Error}(J_I)$ (lines 25-31) computes the error of the predicted output for the Jordan matrix reduced to the components indexed by $I$.
  • ...and 4 more figures
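
The two functions named in the Figure 2 caption can each be produced by a small linear recurrent network. The sketch below is one possible construction, not necessarily the networks drawn in the figure: the weight matrices, the initial states, and the helper `run_lrnn` are assumptions. In both cases the first neuron $x_1$ is read as the output and the time step is $\tau = 1$.

```python
def run_lrnn(weights, state, steps):
    """Iterate state <- W * state and record neuron x_1 at each time step.

    Hypothetical helper: `weights` is a square matrix given as a list of rows,
    `state` holds the neuron values at time t_0 = 0.
    """
    outputs = []
    for _ in range(steps):
        outputs.append(state[0])
        state = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return outputs


# f(t) = t^2 with neurons (t^2, t, 1), using (t+1)^2 = t^2 + 2t + 1.
square_weights = [
    [1, 2, 1],
    [0, 1, 1],
    [0, 0, 1],
]
squares = run_lrnn(square_weights, [0, 0, 1], 6)

# Fibonacci series with neurons (F(t), F(t+1)).
fibonacci_weights = [
    [0, 1],
    [1, 1],
]
fibonacci = run_lrnn(fibonacci_weights, [0, 1], 7)

print(squares)
print(fibonacci)
```

Both networks use only linear activation, so each step is a single matrix-vector product.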

Theorems & Definitions (20)

  • Definition 1: time series
  • Definition 2: recurrent neural network
  • Definition 3: linear recurrent neural network
  • Example 1
  • Example 2
  • proof
  • proof
  • Example 3
  • Remark 1
  • proof
  • ...and 10 more