Activation degree thresholds and expressiveness of polynomial neural networks

Bella Finkel; Jose Israel Rodriguez; Chenxi Wu; Thomas Yahl

TL;DR

The paper develops an algebro-geometric framework that quantifies the expressiveness of polynomial neural networks through the dimension of their neurovarieties. It proves that the activation degree threshold exists for every architecture without width-one bottlenecks and gives the universal bound $\mathrm{ActThr}(\mathbf{d}) \le 6m^2-6m+1$, where $m$ is the largest width in the architecture $\mathbf{d}$; this resolves the high activation degree conjecture of Kileel, Trager, and Bruna. The argument uses a fiber-based approach to show that the neurovariety attains its expected dimension for large activation degrees, extending key bounds from univariate to multivariate polynomials. For equi-width architectures the authors prove $\mathrm{ActThr}(\mathbf{d})=1$, that is, the neurovariety attains the expected dimension for every activation degree $r>1$, which motivates architecture design guided by algebro-geometric expressiveness measures.

Abstract

We study the expressive power of deep polynomial neural networks through the geometry of their neurovariety. We introduce the notion of the activation degree threshold of a network architecture to express when the dimension of the neurovariety achieves its theoretical maximum. We prove the existence of the activation degree threshold for all polynomial neural networks without width-one bottlenecks and demonstrate a universal upper bound that is quadratic in the width of largest size. In doing so, we prove the high activation degree conjecture of Kileel, Trager, and Bruna. Certain structured architectures have exceptional activation degree thresholds, making them especially expressive in the sense of their neurovariety dimension. In this direction, we prove that polynomial neural networks with equi-width architectures are maximally expressive by showing their activation degree threshold is one.
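The comparison between the dimension of the neurovariety and its theoretical maximum can be tried out on a small architecture. The sketch below is illustrative only and is not taken from the paper: it assumes the network convention $p(x) = W_L\,\sigma_r(\cdots\sigma_r(W_1 x))$ with entrywise $r$-th powers, and it assumes the expected-dimension formula of Kileel, Trager, and Bruna, $\min\{d_L + \sum_{i=1}^{L}(d_i d_{i-1} - d_i),\ d_L\binom{d_0 + r^{L-1} - 1}{r^{L-1}}\}$. The function names (`expected_dimension`, `jacobian_rows`, `jacobian_rank`) are hypothetical. The dimension is estimated as the exact rank of the Jacobian of the outputs with respect to the weights, evaluated at a chosen set of input points.

```python
from fractions import Fraction
from math import comb


def expected_dimension(widths, r):
    """Assumed formula: min(weights minus scaling symmetries, ambient dimension)."""
    L = len(widths) - 1
    params = widths[L] + sum(widths[i] * widths[i - 1] - widths[i]
                             for i in range(1, L + 1))
    degree = r ** (L - 1)  # degree of each output polynomial
    ambient = widths[L] * comb(widths[0] + degree - 1, degree)
    return min(params, ambient)


def jacobian_rows(weights, x, r):
    """Exact derivatives of each network output at x w.r.t. every weight."""
    L = len(weights)
    acts, pre = [list(x)], []
    for i, W in enumerate(weights):
        z = [sum(w * t for w, t in zip(row, acts[-1])) for row in W]
        pre.append(z)
        if i < L - 1:  # no activation after the last layer
            acts.append([t ** r for t in z])
    d_out = len(weights[-1])
    # G[o][j] = d(output_o) / d(pre-activation j of the current layer)
    G = [[int(o == j) for j in range(d_out)] for o in range(d_out)]
    grads = [None] * L
    for i in range(L - 1, -1, -1):
        W = weights[i]
        grads[i] = [[G[o][j] * a for j in range(len(W)) for a in acts[i]]
                    for o in range(d_out)]
        if i > 0:  # chain rule through W and the power activation below it
            z_prev = pre[i - 1]
            G = [[sum(G[o][j] * W[j][k] for j in range(len(W)))
                  * r * z_prev[k] ** (r - 1)
                  for k in range(len(z_prev))] for o in range(d_out)]
    return [sum((grads[i][o] for i in range(L)), []) for o in range(d_out)]


def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(rk, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[rk], M[pivot] = M[pivot], M[rk]
        for i in range(rk + 1, len(M)):
            f = M[i][c] / M[rk][c]
            if f:
                M[i] = [a - f * b for a, b in zip(M[i], M[rk])]
        rk += 1
        if rk == len(M):
            break
    return rk


def jacobian_rank(weights, points, r):
    """Rank of the stacked Jacobian over all evaluation points."""
    rows = []
    for x in points:
        rows.extend(jacobian_rows(weights, x, r))
    return rank(rows)


# Equi-width example d = (2, 2, 2) with r = 2; weights and points are arbitrary choices.
example_weights = [[[1, 2], [3, 4]], [[1, 1], [1, -1]]]
example_points = [(1, 0), (0, 1), (1, 1), (1, -1)]
print(expected_dimension((2, 2, 2), 2),
      jacobian_rank(example_weights, example_points, 2))
```

The rank computed this way is only a lower bound on the dimension of the neurovariety: it matches the dimension when the weights are generic and the evaluation points determine a polynomial of the relevant degree, so special weights or too few points can underestimate it.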

Paper Structure (9 sections, 11 theorems, 70 equations, 1 figure)

Polynomial neural networks and neurovarieties
Background
Main result and example
Expressiveness for high activation degree
A prior number theoretic result by Newman--Slater
Powers of non-proportional multivariate polynomials
Deep networks and high activation degree
Equi-width setting
Outlook

Key Result

Lemma 4

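For all invertible diagonal matrices $D_i\in\mathbb{R}^{d_i\times d_i}$ and permutation matrices $P_i\in\mathbb{Z}^{d_i\times d_i}$ ($i=1,\dots,L-1$), the parameter map $\Psi_{\mathbf{d},r}$ returns the same neural network under the replacement

$$W_1 \leftarrow P_1 D_1 W_1,\qquad W_i \leftarrow P_i D_i W_i D_{i-1}^{-r} P_{i-1}^{T}\ \ (i=2,\dots,L-1),\qquad W_L \leftarrow W_L D_{L-1}^{-r} P_{L-1}^{T},$$

where $T$ denotes the matrix transpose.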
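The invariance in this lemma can be checked numerically on a small example. The sketch below is a hedged illustration rather than the authors' code: it assumes the network $p(x) = W_L\,\sigma_r(\cdots\sigma_r(W_1 x))$ with entrywise $r$-th powers, and it assumes the replacement takes the multi-homogeneity form $W_1 \leftarrow P_1 D_1 W_1$, $W_i \leftarrow P_i D_i W_i D_{i-1}^{-r} P_{i-1}^{T}$ for $1<i<L$, and $W_L \leftarrow W_L D_{L-1}^{-r} P_{L-1}^{T}$. The helper names `forward` and `reparametrize` are hypothetical, each $D_i$ is stored as its list of diagonal entries, and each $P_i$ is stored as a list `p` in which row `j` of $P_i M$ is row `p[j]` of $M$. Exact rational arithmetic is used so the two outputs can be compared with `==`.

```python
from fractions import Fraction


def forward(weights, x, r):
    """Evaluate W_L sigma_r(... sigma_r(W_1 x)) with entrywise r-th powers."""
    a = list(x)
    for i, W in enumerate(weights):
        a = [sum(w * t for w, t in zip(row, a)) for row in W]
        if i < len(weights) - 1:  # no activation after the last layer
            a = [t ** r for t in a]
    return a


def reparametrize(weights, diags, perms, r):
    """Apply the assumed replacement layer by layer (weights[i] is W_{i+1})."""
    L = len(weights)
    new = []
    for i, W in enumerate(weights):
        M = [list(row) for row in W]
        if i > 0:
            # Right-multiply by D^{-r} P^T of the previous hidden layer:
            # new column c is old column p[c] scaled by d[p[c]]^{-r}.
            d, p = diags[i - 1], perms[i - 1]
            M = [[row[p[c]] / d[p[c]] ** r for c in range(len(p))] for row in M]
        if i < L - 1:
            # Left-multiply by P D of this hidden layer:
            # new row j is old row p[j] scaled by d[p[j]].
            d, p = diags[i], perms[i]
            M = [[d[p[j]] * v for v in M[p[j]]] for j in range(len(p))]
        new.append(M)
    return new


# Arbitrary example with widths (2, 3, 2, 1) and activation degree r = 3.
weights = [[[1, 2], [3, 4], [5, 6]], [[1, -1, 2], [0, 3, 1]], [[2, 1]]]
diags = [[Fraction(2), Fraction(-1, 3), Fraction(5)], [Fraction(1, 2), Fraction(-4)]]
perms = [[2, 0, 1], [1, 0]]
x = [Fraction(1, 2), Fraction(-3)]

moved = reparametrize(weights, diags, perms, 3)
print(forward(weights, x, 3) == forward(moved, x, 3))
```

The check works because $\sigma_r(P D z) = P D^{r}\sigma_r(z)$ for a permutation matrix $P$ and a diagonal matrix $D$, so the factor $D^{-r}P^{T}$ absorbed into the next layer cancels it exactly.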

Figures (1)

Figure 1: The blue curve is the graph of $r\mapsto \det(J(r))$, where $J(r)$ is the $6\times 6$ submatrix of \eqref{eq:six-eight-jacobian}. The red curve is $(r,\log|y|)$, where $(r,y)$ is on the blue curve. The orange dashed lines are $r=1$ and $r=2$.

Theorems & Definitions (30)

Definition 1: Polynomial neural network
Remark 2
Remark 3: Special case
Lemma 4: KTB2019, Multi-homogeneity
Definition 5: Expected dimension
Example 6
Example 7
Remark 8
Definition 10
Theorem \ref{theorem:high-degree}
...and 20 more
