Entrywise application of non-linear functions on orthogonally invariant matrices

Roland Speicher, Alexander Wendel

TL;DR

The paper develops a general framework for understanding how entrywise non-linear transformations of symmetric, orthogonally invariant random matrices affect the limiting eigenvalue distribution. Using a cumulant-based analysis and the Leonov–Shiryaev formula, it derives a Gaussian equivalence principle: the transformed matrix $Y_N$ has the same asymptotic spectrum as a linear combination with an independent GOE, $\tilde{Y}_N=\theta_1 X_N+\theta_2 Z_N$, where $g$ is Gaussian and $\theta_1=\mathbb{E}[f'(g)]$, $\theta_2^2=\mathbb{E}[f(g)^2]-\mathbb{E}[f(g)]^2-\theta_1^2\,\text{Var}(X_N)$. This leads to a free-convolution description of the spectrum and extends to multivariate inputs, with concrete examples such as ReLU and the maximum function, including correlated cases validated numerically. The results provide a broad, interpretable principle for non-linear entrywise perturbations in high-dimensional rotationally invariant models, with implications for understanding non-linear preprocessing in learning and physics-inspired matrix models.
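The Gaussian equivalence principle can be checked numerically on a toy case. The following is a minimal sketch, not the authors' code, and it rests on several assumptions that are not stated in this summary: the input is taken as $X_N = U D U^T$ with a Haar orthogonal $U$ and a diagonal $D$ of $\pm 1$ entries, the non-linearity is applied as $Y_N = f(\sqrt{N}\,x_{ij})/\sqrt{N}$, and $f$ is ReLU shifted by its Gaussian mean so that spectral moments are not dominated by a rank-one outlier. Under these assumptions $g$ is standard normal, $\theta_1 = 1/2$ and $\theta_2^2 = 1/2 - 1/(2\pi) - 1/4$. The function names (`haar_orthogonal`, `goe`, `spectral_moment`, `compare_moments`) are hypothetical helpers introduced for illustration.

```python
import numpy as np


def haar_orthogonal(n, rng):
    """Sample a Haar-distributed orthogonal matrix via QR with sign correction."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))  # multiplies column j by sign(r_jj)


def goe(n, rng):
    """GOE normalised so that off-diagonal entries have variance 1/n."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2 * n)


def spectral_moment(m, k):
    """k-th moment of the empirical eigenvalue distribution of a symmetric matrix."""
    return float(np.mean(np.linalg.eigvalsh(m) ** k))


def compare_moments(n=400, seed=0):
    rng = np.random.default_rng(seed)

    # Assumed toy input: X = U D U^T with eigenvalues +1 and -1 in equal proportion.
    u = haar_orthogonal(n, rng)
    d = np.where(np.arange(n) % 2 == 0, 1.0, -1.0)
    x = (u * d) @ u.T
    x = (x + x.T) / 2  # remove floating-point asymmetry

    # Assumed scaling: f acts on sqrt(n) * x_ij, and the result is rescaled by 1/sqrt(n).
    # f is ReLU minus its Gaussian mean 1/sqrt(2*pi) (centering is an assumption here).
    mean_relu = 1.0 / np.sqrt(2 * np.pi)
    y = (np.maximum(np.sqrt(n) * x, 0.0) - mean_relu) / np.sqrt(n)

    # Coefficients for ReLU with standard normal g, where Var(X_N) = 1 in this toy case.
    theta1 = 0.5                                    # E[f'(g)] = P(g > 0)
    theta2 = np.sqrt(0.5 - 1.0 / (2 * np.pi) - theta1 ** 2)

    y_tilde = theta1 * x + theta2 * goe(n, rng)     # Gaussian equivalent

    return {k: (spectral_moment(y, k), spectral_moment(y_tilde, k)) for k in (2, 4)}


if __name__ == "__main__":
    for k, (m_nonlinear, m_equivalent) in compare_moments().items():
        print(f"moment {k}: non-linear {m_nonlinear:.4f}, equivalent {m_equivalent:.4f}")
```

Comparing the second and fourth spectral moments is only a coarse check. A histogram of `np.linalg.eigvalsh(y)` against `np.linalg.eigvalsh(y_tilde)` gives a fuller picture, in the spirit of the figures listed further down.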

Abstract

In this article, we investigate how the entrywise application of a non-linear function to symmetric orthogonally invariant random matrix ensembles alters the spectral distribution. We also treat the multivariate case, where multivariate functions are applied to the entries of several orthogonally invariant matrices and even correlations between the matrices are allowed. We find that in all those cases a Gaussian equivalence principle holds, that is, the asymptotic effect of the non-linear function is the same as taking a linear combination of the involved matrices and an additional independent GOE. The ReLU function in the case of one matrix and the max function in the case of two matrices provide illustrative examples.

Paper Structure

This paper contains 18 sections, 9 theorems, 111 equations, 4 figures.

Key Result

Proposition 1

Let $X_N=(x_{ij})_{i,j=1}^N$ be a symmetric orthogonally invariant random matrix ensemble which has a limit distribution of all orders (that is, all limits as in \ref{eq:limitsofcorrelations} exist). Let $i_1, \dots, i_n, j_1, \dots, j_n \in [N]$. Then the following holds for the joint cumulan

Figures (4)

  • Figure 1: The effect of entrywise application of ReLU on $X_N$ from \ref{eq:X_Neins} and comparison to the Gaussian equivalent $\hat{Y}_N$ from \ref{eq:Y_Nhateins}; $N=5000$
  • Figure 2: The effect of entrywise application of max on $X_N^{(1)}$ and $X_N^{(2)}$ from \ref{eq:XN1undX_2}; $N=5000$
  • Figure 3: Superposition of the eigenvalue distributions of the non-linear matrix $Y_N$ and of its Gaussian equivalent $\hat{Y}_N$ from \ref{eq:YNhatzwei}; $N=5000$
  • Figure 4: Superposition of the eigenvalues of the non-linear matrix $Y_N$, given as the max of the correlated matrices $X_N^{(1)}$ and $X_N^{(2)}$ from \ref{eq:XN1undXN2cor}, and of its Gaussian equivalent $\hat{Y}_N$ from \ref{eq:YNhatdrei}; $N=5000$

Theorems & Definitions (25)

  • Proposition 1
  • Proposition 3
  • proof
  • Theorem 4
  • proof
  • Definition 5: $\mathcal{S}$-word
  • Remark 6
  • Definition 7: length, weight, support of a word
  • Definition 8: Graph associated with an $\mathcal{S}$-word
  • proof: Continuation of Proof
  • ...and 15 more