Spiky Rank and Its Applications to Rigidity and Circuits

Lianna Hambardzumyan, Konstantin Myasnikov, Artur Riazanov, Morgan Shirley, Adi Shraibman

TL;DR

This work proposes spiky rank as a well-behaved candidate matrix complexity measure and shows that large spiky rank implies high matrix rigidity, and that spiky rank lower bounds yield lower bounds for depth-2 ReLU circuits, the basic building blocks of neural networks.

Abstract

We introduce spiky rank, a new matrix parameter that enhances blocky rank by combining the combinatorial structure of the latter with linear-algebraic flexibility. A spiky matrix is block-structured with diagonal blocks that are arbitrary rank-one matrices, and the spiky rank of a matrix is the minimum number of such matrices required to express it as a sum. This measure extends blocky rank to real matrices and is more robust for problems with both combinatorial and algebraic character. Our conceptual contribution is as follows: we propose spiky rank as a well-behaved candidate matrix complexity measure and demonstrate its potential through applications. We show that large spiky rank implies high matrix rigidity, and that spiky rank lower bounds yield lower bounds for depth-2 ReLU circuits, the basic building blocks of neural networks. On the technical side, we establish tight bounds for random matrices and develop a framework for explicit lower bounds, applying it to Hamming distance matrices and spectral expanders. Finally, we relate spiky rank to other matrix parameters, including blocky rank, sparsity, and the $γ_2$-norm.
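The definition above can be made concrete with a small sketch. It builds a spiky matrix in the way the caption of Figure 1 describes, as the entrywise product of a blocky matrix and a rank-one matrix. It assumes that a blocky matrix is a 0/1 matrix whose ones form blocks with pairwise disjoint row sets and column sets. All function names, block labels, and vectors below are hypothetical choices made for illustration and do not come from the paper.

```python
from itertools import combinations

def blocky_matrix(row_labels, col_labels):
    # Assumed convention: entry (i, j) is 1 iff row i and column j carry the
    # same block label, so blocks have disjoint row sets and column sets.
    return [[1 if r == c else 0 for c in col_labels] for r in row_labels]

def rank_one_matrix(u, v):
    # Outer product u v^T.
    return [[ui * vj for vj in v] for ui in u]

def entrywise_product(a, b):
    # Hadamard (entrywise) product of two equally sized matrices.
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def block_has_rank_at_most_one(m, rows, cols):
    # A matrix has rank at most one iff every 2x2 minor vanishes.
    for i1, i2 in combinations(rows, 2):
        for j1, j2 in combinations(cols, 2):
            if m[i1][j1] * m[i2][j2] != m[i1][j2] * m[i2][j1]:
                return False
    return True

# Illustrative data: two blocks, rows {0,1} x cols {0,1,2} and rows {2,3,4} x cols {3,4}.
row_labels = [0, 0, 1, 1, 1]
col_labels = [0, 0, 0, 1, 1]
u = [1, 2, 3, 4, 5]
v = [1, -1, 2, 3, 1]

spiky = entrywise_product(blocky_matrix(row_labels, col_labels),
                          rank_one_matrix(u, v))

for row in spiky:
    print(row)
```

Each block of `spiky` is a rank-one matrix, while the matrix as a whole has rank two. Under the definition in the abstract, the spiky rank of a matrix is the least number of such matrices whose sum equals it; this sketch only constructs a single spiky matrix and does not compute spiky rank.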

Paper Structure (39 sections, 39 theorems, 69 equations, 5 figures, 1 algorithm)

Key Result

Theorem 1.1

Let $M$ be a matrix and $0 < r \leq \mathsf{spr}(M)$. Then,

Figures (5)

  • Figure 1: A spiky matrix is the entrywise product of a blocky matrix and a rank-one matrix.
  • Figure 2: The identity matrix $I_n$ (left) and the diagonal matrix $D_n$ (right).
  • Figure 3: The picture illustrates the upper and lower bounds for spiky rank: above the scale there are known lower bounds and upper bounds. Below the scale are consequences of the corresponding lower bounds. The dashed part of the scale represents the state of the art of explicit lower bounds. Here $N$ is always the size of the matrix, so for $\mathrm{HD}_1^{n}$, $\textsc{Disj}_n$, and $\mathrm{IP}_n$, $N = 2^n$.
  • Figure 4: The picture illustrates the proof for the matrix $\mathrm{HD}_1^{5}$. The parts $T_i$ are colored red, $H_i$ are colored green. By \ref{item:thin} not too many $1$-cells (black ones) are covered by the small blocks, so by \ref{item:permutation} we find a permutation submatrix $S \times T$ with very few covered $1$-cells, and then further shrink it using \ref{lem:non-intersecting-subset} to $A \times B$ that does not contain any red cells at all. Then the rank of $\mathrm{HD}_1^{5}|_{A \times B}$ is maximal, but the total rank of large blocks is small, which is a contradiction.
  • Figure 5: The picture illustrates the process for $v=(1100)$, so $P = \{1,2\}$ and $Z = \{3,4\}$. Here the green matrices illustrate $R_{1}$ and $R_2$, the red matrices are $R_3$ and $R_4$, the hatched area in the right square is the support of the matrix $M{\uparrow}v$.

Theorems & Definitions (75)

  • Theorem 1.1
  • Proposition 1.4
  • Theorem 1.6: Informal
  • Theorem 1.7: Informal
  • Theorem 1.8
  • Theorem 1.10
  • Theorem 1.13
  • Theorem 1.14
  • Definition 2.1
  • Definition 2.2
  • ...and 65 more