
Structural and dynamical strategies to prevent runaway excitation in reservoir computing

Claus Metzner, Achim Schilling, Andreas Maier, Thomas Kinfe, Patrick Krauss

Abstract

Reservoirs, typically implemented as recurrent neural networks with fixed random connection weights, can be combined with a simple trained readout layer to perform a wide range of computational tasks. However, increasing the magnitude of reservoir connection weights to exploit nonlinear dynamics can cause the network to develop strong spontaneous activity that drives neurons into saturation, dramatically degrading performance. In this work, we investigate two distinct countermeasures against such runaway excitation. The first approach introduces a subtle non-homogeneous structure into the matrix of connection weights $w_{ij}$, without altering the overall probability distribution $p(w)$. We identify several favorable structuring principles, such as creating a small subset of neurons with weaker-than-average input connections. Even if the rest of the reservoir falls into runaway saturating behavior, this weakly coupled subset remains in a mildly nonlinear regime whose dynamics can still be exploited by the readout layer. The second approach implements a form of automatic gain control, in which a dedicated control unit dynamically regulates the reservoir's average global activation toward an optimal setpoint. Although the control unit modulates the excitability of the reservoir only via a global gain factor, this mechanism substantially enlarges the dynamical regime favorable for computation and renders performance largely independent of the underlying connection statistics.
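The first countermeasure can be illustrated with a minimal numerical sketch. All concrete choices below (the sign convention linking the balance parameter $b$ to the weight signs, the reservoir size, and the row-scaling factor) are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50          # reservoir size, as in the paper's figures
w = 1.0         # strong coupling, where runaway excitation occurs
b = 0.9         # strongly positive excitatory/inhibitory balance (assumed convention)

# Random weights: a fraction (1+b)/2 of the entries is positive, the rest
# negative, so b controls the excitatory/inhibitory balance (assumption).
signs = np.where(rng.random((N, N)) < (1 + b) / 2, 1.0, -1.0)
W = w * signs * rng.random((N, N))

# Structural countermeasure: scale down 20 percent of the rows (each row
# holds the input weights of one neuron), creating a weakly coupled subset
# while leaving most of the matrix, and the weight statistics, nearly intact.
W_struct = W.copy()
weak = rng.choice(N, size=N // 5, replace=False)
W_struct[weak] *= 0.02   # illustrative scaling factor

def run(Wm, T=200):
    """Iterate the autonomous reservoir update x <- tanh(W x)."""
    x = 0.1 * rng.standard_normal(N)
    for _ in range(T):
        x = np.tanh(Wm @ x)
    return x

x_full = run(W)
x_struct = run(W_struct)

# The homogeneous reservoir saturates globally, while the weakly coupled
# subset of the structured reservoir stays in a mildly nonlinear regime.
print(f"mean |x|, homogeneous reservoir: {np.mean(np.abs(x_full)):.2f}")
print(f"mean |x|, weak subset:           {np.mean(np.abs(x_struct[weak])):.2f}")
```

In this sketch the strongly unbalanced homogeneous reservoir runs into a saturated global fixed point, whereas the down-scaled rows receive much weaker total input and their neurons remain on the nonlinear but unsaturated part of the tanh curve.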

Paper Structure

This paper contains 53 sections, 18 equations, and 4 figures.

Figures (4)

  • Figure 1: Overview. (a) Reservoir computer consisting of a recurrent neural network (RNN with $N$ neurons, central circle with blue dots), an input layer (left box with green dots), and a readout layer (right box with orange dots). A control unit for dynamic gain regulation can optionally be placed on top of the RNN. The colored grids symbolize sequences of input and output vectors. (b) Three direct plots of neural activations (color coded) over time in a densely connected RNN of 50 neurons with tanh activation functions. Because the coupling strength is large ($w\!=\!1$), the system develops global oscillations for strongly negative balance parameters such as $b\!=\!-0.9$, chaotic fluctuations for highly balanced systems such as $b\!=\!0$, and global fixed points for strongly positive balance parameters such as $b\!=\!+0.9$. All these attractor regimes are detrimental to task-related information processing. (c) The Global Performance (GP, color coded) measure, defined as the accuracy averaged over nine bias values spanning the full range from $-1$ to $+1$, evaluated for $3\times5$ different types of permutative matrix structuring that all preserve the global weight distribution (compare main text and Fig. 2). The best GP is obtained when 20 percent of the rows in the weight matrix (representing neural inputs) have a weaker-than-average magnitude.
  • Figure 2: Structuring of the weight matrix. (a) Three examples of GP-enhancing types of permutative matrix structuring, shown for three different balance values $b$. In each case, the full reservoir connection matrix is displayed with color-coded weights. (b) Four examples of permutative matrix structuring that do not lead to an improvement in global performance. (c) Probability distributions of connection weights for four different types of permutative matrix structuring. The distributions remain unchanged by the structuring. The four curves have been slightly shifted vertically to improve visibility.
  • Figure 3: Principal Component Analysis (PCA) of reservoir activations. Each of the $5\times3$ plots represents the lowest eight principal components of the reservoir activation states as a function of time. The magnitudes of these components differ greatly. Since the readout layer is insensitive to absolute magnitude, we normalize the values within the observation window to the full range from $-1$ to $+1$ to improve visibility. The first column of plots shows the homogeneous (non-structured) system. In the range of strongly negative bias ($b\!=\!-1$ and $b\!=\!-0.9$), all PCA components reflect the reservoir's global oscillations with a period of two. In the balanced case ($b\!=\!0$), the components fluctuate chaotically. In the range of strongly positive bias ($b\!=\!+0.9$ and $b\!=\!+1$), where the reservoir resides in a global fixed-point attractor, all PCA components are zero because the mean is subtracted in the PCA analysis. The second column of plots shows the structured system with some rows of reduced magnitude in the weight matrix. In the balanced case ($b\!=\!0$), we again observe chaotic behavior. However, in all strongly unbalanced cases ($b\!=\!-1$, $b\!=\!-0.9$, $b\!=\!+0.9$ and $b\!=\!+1$), the PCA components exhibit much richer dynamics. Some display a quasi-periodic pattern with a period of three, corresponding to the length of each input episode. These components therefore encode input-related information that can be exploited by the readout layer. The third column of plots shows the system equipped with a unit for automatic gain control. Here, the information encoded in the PCA components is even more diverse and more strongly aligned with the input pulses.
  • Figure 4: Scan of dynamical properties and accuracy over the excitatory/inhibitory balance parameter $b$. (a) Non-structured system with the coupling strength reduced to $w\!=\!0.1$. Even in this weakly coupled reservoir, three distinct dynamical regimes appear. In the globally oscillating regime at strongly negative $b$, the fluctuation $F$ (blue), the nonlinearity $N$ (red), and the instantaneous covariance $C_0$ (orange) approach the value $+1$, whereas $C_1$ (green), the covariance at lag time one, approaches $-1$. In the calm regime around $b\!=\!0$, fluctuations $F$ are small, the nonlinearity parameter $N$ is close to $-1$ (indicating that neurons operate in the linear regime of the activation function), and both covariances are close to zero. In this calm regime, the reservoir dynamics are mainly determined by the external input signals. In the global fixed-point regime at strongly positive $b$, the nonlinearity $N$ and both covariances $C_0$ and $C_1$ approach $+1$, whereas the fluctuation $F$ becomes zero. Although the neurons are saturated in the oscillatory and fixed-point regimes, the accuracy $A$ remains close to $1$ for all balances $b$, due to the weak coupling strength $w\!=\!0.1$. In weakly coupled reservoirs, information-processing activity can "ride on top" of these large activations, as we have shown previously. (b) Non-structured system with the coupling strength increased to $w\!=\!1$. Within the oscillatory and fixed-point regimes, the dynamical properties $F$, $C_0$, $C_1$, and $N$ behave qualitatively as in the weakly coupled reservoir. However, in the balanced system around $b\!=\!0$, the calm system state is now replaced by high-amplitude chaotic spontaneous fluctuations. This leads to vanishing covariances $C_0\!=\! C_1\!=\!0$ and values close to one for the fluctuation $F$ and the nonlinearity $N$ (indicating operation in the saturation regime of the activation function). The accuracy $A$ is now close to chance level $A\!=\!0.5$ in the oscillatory, chaotic, and fixed-point regimes. Only at the two "edges of chaos" does the accuracy rise to levels of good performance. (c) Structured system with 20 percent weak rows. Here, the dynamical quantities appear qualitatively similar to case (b), but in the chaotic regime the fluctuation $F$ and the nonlinearity $N$ are significantly reduced. Most importantly, however, the accuracy $A$ is now close to one for all balances $b$ except in the chaotic regime, where it remains only slightly above chance level. (d) System with automatic gain control. Here, the nonlinearity parameter $N$ is relatively close to $-1$, indicating predominantly linear operation of the neurons. Fluctuations $F$ are small and mainly input-driven. The covariances are very small. Together, the dynamical quantities resemble those of the weakly coupled reservoir in panel (a) within the calm regime around $b\!=\!0$, yet this calm state now extends over the full range of balance values between $-1$ and $+1$. The accuracy is also close to perfect across this full balance range.
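The automatic gain control underlying panel (d) of Figure 4 can be sketched as a simple integral-style controller acting on a global gain factor. The setpoint, adaptation rate, weight statistics, and the weak external drive below are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50
W = rng.standard_normal((N, N))   # strong random coupling (illustrative)

target = 0.5   # setpoint for the mean absolute activation (assumption)
eta = 0.05     # adaptation rate of the controller (assumption)
g = 1.0        # global gain factor, regulated online by the control unit

x = 0.1 * rng.standard_normal(N)
for t in range(1000):
    u = 0.1 * rng.standard_normal(N)    # weak external input drive
    # The control unit only rescales the recurrent input by the gain g.
    x = np.tanh(g * (W @ x) + u)
    mean_act = np.mean(np.abs(x))
    # Integral-style control: raise the gain when the reservoir is too
    # quiet, lower it when the activity approaches saturation.
    g += eta * (target - mean_act)
    g = max(g, 0.0)

print(f"gain g = {g:.3f}, mean |x| = {np.mean(np.abs(x)):.2f}")
```

Even though the controller touches only one scalar, it pulls the mean activation toward the setpoint regardless of the weight statistics, which mirrors the paper's observation that gain control renders performance largely independent of the underlying connection matrix.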
The accuracy $A$ is now close to chance level $A\!=\!0.5$ in the oscillatory, chaotic, and fixed-point regimes. Only at the two "edges of chaos" does the accuracy rise to levels of good performance. •$\;$(c) Structured system with 20 percent weak rows. Here, the dynamical quantities appear qualitatively similar to case (b), but in the chaotic regime the fluctuation $F$ and the nonlinearity $N$ are significantly reduced. Most importantly, however, the accuracy $A$ is now close to one for all balances $b$ except in the chaotic regime, where it remains only slightly above chance level. •$\;$(d) System with automatic gain control. Here, the nonlinearity parameter $N$ is relatively close to $-1$, indicating predominantly linear operation of the neurons. Fluctuations $F$ are small and mainly input-driven. The covariances are very small. Together, the dynamical quantities resemble those of the weakly coupled reservoir in panel (a) within the calm regime around $b\!=\!0$, yet this calm state now extends over the full range of balance values between $-1$ and $+1$. The accuracy is also close to perfect across this full balance range.