Separation capacity of linear reservoirs with random connectivity matrix

Youness Boutaib

TL;DR

The paper addresses how well random linear reservoirs separate distinct input time series by linking separation to the spectral properties of generalized moment matrices derived from the random connectivity matrix. It develops a rigorous framework showing that, for Gaussian $W$, the separation capacity is governed by the eigenstructure of $B_{T}$ in 1D and $B_{T,N}$ in higher dimensions, with detailed results for the symmetric and i.i.d. cases and precise scaling laws (notably $\sigma \sim 1/\sqrt{N}$). It provides both asymptotic spectral guarantees (via connections to the semicircle law in the symmetric case and related limits in the i.i.d. case) and probabilistic separation bounds (root-based and concentration-based), then discusses implications for reservoir design and task performance, including how to balance separation against robustness as input length grows. The findings offer theoretical justification for empirical scaling heuristics and guide practical choices of reservoir size $N$, time horizon $T$, and connectivity scaling, while outlining open problems around eigenvector dynamics, nonlinearity effects, and optimization of hyperparameters.

Abstract

A natural hypothesis for the success of reservoir computing in generic tasks is the ability of the untrained reservoir to map different input time series to separable reservoir states, a property we term separation capacity. We provide a rigorous mathematical framework to quantify this capacity for random linear reservoirs, showing that it is fully characterised by the spectral properties of the generalised matrix of moments of the random reservoir connectivity matrix. Our analysis focuses on reservoirs with Gaussian connectivity matrices, both symmetric and i.i.d., although the techniques extend naturally to broader classes of random matrices. In the symmetric case, the generalised matrix of moments is a Hankel matrix. Using classical estimates from random matrix theory, we establish that separation capacity deteriorates over time and that, for short inputs, optimal separation in large reservoirs is achieved when the matrix entries are scaled with a factor $\rho_T/\sqrt{N}$, where $N$ is the reservoir dimension and $\rho_T$ depends on the maximum input length. In the i.i.d. case, we establish that optimal separation with large reservoirs is consistently achieved when the entries of the reservoir matrix are scaled with the exact factor $1/\sqrt{N}$, which aligns with common implementations of reservoir computing. We further give upper bounds on the quality of separation as a function of the length of the time series. We complement this analysis with an investigation of the likelihood of this separation and its consistency under different architectural choices.
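As a rough numerical illustration of why the $1/\sqrt{N}$ scaling is a natural candidate in the i.i.d. case, the sketch below samples an $N \times N$ matrix with centred Gaussian entries of standard deviation $N^{-\alpha}$ and reports its spectral radius. It does not reproduce the paper's argument or its notion of separation capacity; it only shows the standard random-matrix fact that the spectrum stays of order one when $\alpha = 1/2$, shrinks for larger $\alpha$, and grows for smaller $\alpha$. The function name `spectral_radius_iid` and the parameter `alpha` are hypothetical names chosen for this example.

```python
import numpy as np


def spectral_radius_iid(N: int, alpha: float, seed: int = 0) -> float:
    """Spectral radius of an N x N matrix with i.i.d. N(0, N^(-2*alpha)) entries.

    Illustration only: `alpha` is a hypothetical scaling exponent, so that
    alpha = 0.5 corresponds to entries of standard deviation 1/sqrt(N).
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(loc=0.0, scale=N ** (-alpha), size=(N, N))
    eigenvalues = np.linalg.eigvals(W)  # complex in general, since W is not symmetric
    return float(np.max(np.abs(eigenvalues)))


if __name__ == "__main__":
    for alpha in (0.25, 0.5, 1.0):
        print(f"alpha={alpha}: spectral radius ~ {spectral_radius_iid(200, alpha):.3f}")
```

With $\alpha = 1/2$ the powers of the matrix neither blow up nor vanish quickly, which is one informal way to see why this scaling keeps information about earlier inputs at a usable magnitude.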

Paper Structure

This paper contains 15 sections, 19 theorems, 197 equations, 11 figures.

Key Result

Theorem 1

Consider a one-dimensional linear reservoir with random connectivity $w$, i.e. the output of the reservoir for the signal $\mathbf{x}:=(x_t)_{0\leq t \leq T}$ is given by … Then the expected separation capacity of the reservoir is characterised by the eigenvalues of the symmetric positive semi-definite matrix $B_T$ … More specifically, if $\lambda_{\mathrm{min}} (B_T)$ and $\lambda_{\mathrm{max}} (B_T)$ d…
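The one-dimensional statement can be explored numerically with a minimal sketch under two assumptions that the truncated theorem text above does not confirm. First, the reservoir output is assumed to be $f_w(\mathbf{x}) = \sum_{t=0}^{T} w^{T-t} x_t$. Second, $B_T$ is taken to be the Hankel matrix of moments $(\mathbb{E}[w^{i+j}])_{0 \leq i,j \leq T}$ with $w \sim \mathcal{N}(0, \rho^2)$, as suggested by the figure captions. Under these assumptions the expected squared distance between two outputs is a quadratic form in $B_T$, so it lies between $\lambda_{\mathrm{min}}(B_T)\,\|\mathbf{x}-\mathbf{y}\|^2$ and $\lambda_{\mathrm{max}}(B_T)\,\|\mathbf{x}-\mathbf{y}\|^2$. All function names are hypothetical.

```python
import numpy as np


def gaussian_moment(k: int, rho: float) -> float:
    """E[w^k] for w ~ N(0, rho^2): 0 for odd k, rho^k * (k-1)!! for even k."""
    if k % 2 == 1:
        return 0.0
    double_factorial = 1.0
    for j in range(1, k, 2):  # 1 * 3 * ... * (k - 1); empty product for k = 0
        double_factorial *= j
    return rho**k * double_factorial


def hankel_moment_matrix(T: int, rho: float) -> np.ndarray:
    """(T+1) x (T+1) Hankel matrix with entries E[w^(i+j)] (assumed form of B_T)."""
    return np.array(
        [[gaussian_moment(i + j, rho) for j in range(T + 1)] for i in range(T + 1)]
    )


def expected_sq_separation(x, y, rho: float) -> float:
    """E[(f_w(x) - f_w(y))^2] under the ASSUMED output f_w(x) = sum_t w^(T-t) x_t."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    c = d[::-1]  # c[i] is the coefficient of w^i in f_w(x) - f_w(y)
    B = hankel_moment_matrix(len(d) - 1, rho)
    return float(c @ B @ c)


if __name__ == "__main__":
    x = np.array([1.0, 0.0, 2.0, -1.0, 0.5])
    y = np.zeros(5)
    B = hankel_moment_matrix(len(x) - 1, rho=1.0)
    lam = np.linalg.eigvalsh(B)  # ascending eigenvalues of a symmetric matrix
    sep = expected_sq_separation(x, y, rho=1.0)
    sq_norm = float(np.sum((x - y) ** 2))
    print("lower bound:", lam[0] * sq_norm)
    print("expected squared separation:", sep)
    print("upper bound:", lam[-1] * sq_norm)
```

In this sketch, a difference located at the last time step contributes with weight $\mathbb{E}[w^0] = 1$, while a difference at the first time step is weighted by $\mathbb{E}[w^{2T}]$, which makes the role of the input length $T$ visible in a small example.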

Figures (11)

  • Figure 1: The evolution over time of the (logarithms of the) largest and smallest eigenvalues of the Hankel matrix of moments of a Gaussian random variable with standard deviation $\rho$.
  • Figure 2: The dominance of the largest eigenvalue over the entire spectrum (as defined in (\ref{eq:DomRatio1})), as a function of the length $T$, of the Hankel matrix of moments $B_T$ of a centred Gaussian random variable with variance $\rho^2$.
  • Figure 3: The evolution, as a function of the reservoir dimension $N$, of the dominance ratio $r_{T,N}$ (as defined in (\ref{eq:DomRatioN})) of the generalised matrix of moments associated with a symmetric $N\times N$ random connectivity matrix with i.i.d. entries on and above the diagonal. These random variables are centred Gaussians with standard deviation $\rho=\frac{1}{N^{\alpha}}$. The different plots correspond to different lengths $T$ of the time series. The different curves in each plot correspond to different values of the scaling exponent $\alpha$.
  • Figure 4: The dominance $r_{T}$ of the largest eigenvalue, as a function of the length $T$, over the entire spectrum of the Hankel matrix of moments $A_{\mathrm{sc}}(\rho,\cdot)$ of the rescaled Wigner semi-circle law with rescaling parameter $\rho$.
  • Figure 5: The evolution, as a function of the length $T$ of the time series, of the dominance ratio $r_{T,N}$ (as defined in (\ref{eq:DomRatioN})) of the generalised matrix of moments associated with a symmetric $N\times N$ random connectivity matrix with i.i.d. entries on and above the diagonal. These random variables are centred and have standard deviation $\rho=\frac{1}{N^{\alpha}}$. The different plots correspond to different dimensions $N$ of the reservoir. The different curves in each plot correspond to different values of the scaling exponent $\alpha$.
  • ...and 6 more figures
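
The quantities plotted in Figures 2 and 4 can be approximated with a short sketch. The definition of the dominance ratio is not reproduced in this summary, so the code assumes $r_T = \lambda_{\mathrm{max}} / \sum_i \lambda_i$, which may differ from the paper's definition. It also assumes that the rescaled semi-circle law is the standard law on $[-2, 2]$ multiplied by $\rho$, whose even moments are $\rho^{2k} C_k$ with $C_k$ the Catalan numbers. The helper names are hypothetical.

```python
from math import comb

import numpy as np


def gaussian_moment(k: int, rho: float) -> float:
    """E[w^k] for w ~ N(0, rho^2): 0 for odd k, rho^k * (k-1)!! for even k."""
    if k % 2 == 1:
        return 0.0
    double_factorial = 1.0
    for j in range(1, k, 2):
        double_factorial *= j
    return rho**k * double_factorial


def semicircle_moment(k: int, rho: float) -> float:
    """k-th moment of rho * S, with S the semi-circle law on [-2, 2] (assumed rescaling)."""
    if k % 2 == 1:
        return 0.0
    m = k // 2
    catalan = comb(2 * m, m) // (m + 1)  # Catalan number C_m
    return rho**k * catalan


def hankel_matrix(moment, T: int, rho: float) -> np.ndarray:
    """(T+1) x (T+1) Hankel matrix with entries moment(i + j, rho)."""
    return np.array([[moment(i + j, rho) for j in range(T + 1)] for i in range(T + 1)])


def dominance_ratio(B: np.ndarray) -> float:
    """ASSUMED definition: largest eigenvalue divided by the sum of all eigenvalues."""
    lam = np.linalg.eigvalsh(B)  # ascending order
    return float(lam[-1] / lam.sum())


if __name__ == "__main__":
    for T in range(1, 7):
        r_gauss = dominance_ratio(hankel_matrix(gaussian_moment, T, rho=1.0))
        r_sc = dominance_ratio(hankel_matrix(semicircle_moment, T, rho=1.0))
        print(f"T={T}: Gaussian r_T={r_gauss:.3f}, semi-circle r_T={r_sc:.3f}")
```

A ratio close to 1 means that a single direction in input space carries most of the expected separation, which is one way to read the deterioration over time mentioned in the abstract.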

Theorems & Definitions (36)

  • Theorem 1
  • Proposition 2
  • Example 1
  • Example 2
  • Remark 3
  • Proposition 4
  • Theorem 5
  • Theorem 6
  • Definition 7
  • Proposition 8
  • ...and 26 more