Geometry of Polynomial Neural Networks

Kaie Kubjas, Jiayi Li, Maximilian Wiesmann

TL;DR

The paper addresses the geometry of polynomial neural networks with monomial activations by framing learnable functions as neuromanifolds and their Zariski closures as neurovarieties. It develops an algebro-geometric framework that quantifies expressivity via an expected dimension-based measure and learning dynamics via the learning degree, relating these to the generic Euclidean distance degree. Concrete results are provided for several architectures, including explicit descriptions via symmetric tensors, Grassmannians, and the Hilbert–Burch theorem, plus bounds and conjectures on dimension and optimization landscapes. Through both theory and experiments, the work illuminates how network structure constrains learnability and optimization, offering insights for architecture design and potential algorithmic techniques in polynomial neural networks.

Abstract

We study the expressivity and learning process for polynomial neural networks (PNNs) with monomial activation functions. The weights of the network parametrize the neuromanifold. In this paper, we study certain neuromanifolds using tools from algebraic geometry: we give explicit descriptions as semialgebraic sets and characterize their Zariski closures, called neurovarieties. We study their dimension and associate an algebraic degree, the learning degree, to the neurovariety. The dimension serves as a geometric measure for the expressivity of the network, while the learning degree measures the complexity of training the network and provides upper bounds on the number of learnable functions. These theoretical results are accompanied by experiments.
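
To make the parametrization concrete, here is a minimal sketch of how the weights of a PNN determine a polynomial function. It assumes the common convention that the monomial activation $t \mapsto t^r$ is applied coordinatewise after every layer except the last and that there are no bias terms. The names `matvec` and `pnn_forward` and the example weights are illustrative and not taken from the paper.

```python
def matvec(W, x):
    """Multiply a matrix (given as a list of rows) by a vector (a list)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def pnn_forward(weights, x, r):
    """Evaluate W_L(sigma(... sigma(W_1 x))) with sigma(t) = t**r coordinatewise.

    Assumption: no biases, and no activation after the final layer.
    """
    h = list(x)
    for W in weights[:-1]:
        h = [t ** r for t in matvec(W, h)]
    return matvec(weights[-1], h)


# Hypothetical weights for widths d = (2, 3, 1) and activation degree r = 2.
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
W2 = [[1.0, -2.0, 0.5]]
weights = [W1, W2]

x = [0.5, -1.5]
y = pnn_forward(weights, x, r=2)

# With L layers, each output coordinate is homogeneous of degree r**(L-1)
# in the input, so scaling x by t scales the output by t**(r**(L-1)).
t = 3.0
y_scaled = pnn_forward(weights, [t * xi for xi in x], r=2)
```

Under these assumptions each output coordinate is a homogeneous polynomial of degree $r^{L-1}$ in the inputs, which is why the functions realized by a fixed architecture can be viewed as points in a space of symmetric tensors.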

Paper Structure

This paper contains 17 sections, 26 theorems, 101 equations, 3 figures, 2 tables, and 1 algorithm.

Key Result

Proposition 2.3

Let $\mathbf{d}=(d_0,\dots,d_i,\dots,d_L)$ and let $\mathbf{d}'=(d_0,\dots,d'_i,\dots,d_L)$ be a tuple which differs from $\mathbf{d}$ precisely in the $i^{\text{th}}$ entry for some $0<i<L$, and assume $d'_i \geq d_i$. Then $\mathcal{M}_{\mathbf{d},r}\subseteq \mathcal{M}_{\mathbf{d}',r}$, in particular …
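
One standard way to see an inclusion of this kind is zero padding: extra neurons in hidden layer $i$ whose incoming and outgoing weights are all zero leave the network function unchanged. The sketch below checks this numerically. It assumes no biases and a coordinatewise activation $t \mapsto t^r$ on every hidden layer; the helper names `pnn_forward` and `widen_hidden_layer` are hypothetical, and this is an illustration rather than the paper's proof.

```python
def matvec(W, x):
    """Multiply a matrix (given as a list of rows) by a vector (a list)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def pnn_forward(weights, x, r):
    """Evaluate the network; sigma(t) = t**r on all layers except the last."""
    h = list(x)
    for W in weights[:-1]:
        h = [t ** r for t in matvec(W, h)]
    return matvec(weights[-1], h)


def widen_hidden_layer(weights, i, new_width):
    """Widen hidden layer i (1-based, 0 < i < L) to new_width neurons.

    Zero rows are appended to W_i and zero columns to W_{i+1}, so the
    added neurons receive nothing and contribute nothing.
    """
    W_in, W_out = weights[i - 1], weights[i]
    extra = new_width - len(W_in)
    if extra < 0:
        raise ValueError("new_width must be at least the current width")
    n_cols = len(W_in[0])
    W_in_padded = [row[:] for row in W_in] + [[0.0] * n_cols for _ in range(extra)]
    W_out_padded = [row[:] + [0.0] * extra for row in W_out]
    return weights[:i - 1] + [W_in_padded, W_out_padded] + weights[i + 1:]


# Hypothetical network with widths d = (2, 2, 1), widened to d' = (2, 3, 1).
small = [[[1.0, 2.0], [-1.0, 0.5]], [[2.0, -3.0]]]
wide = widen_hidden_layer(small, i=1, new_width=3)

points = [[1.0, 2.0], [0.5, -1.0], [-2.0, 3.0]]
outputs_small = [pnn_forward(small, p, r=2) for p in points]
outputs_wide = [pnn_forward(wide, p, r=2) for p in points]
```

Both networks return the same values at every test point, so the function realized by the narrower network is also realized by the wider one.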

Figures (3)

  • Figure 1: A neural network architecture with widths $\mathbf{d} = (2,3,3,1)$, input $\mathbf{x} = (x_1, x_2)^T$ and output $y_1$.
  • Figure 2: The neuromanifold $\mathcal{M}_{(2,1,1), 2}$ in $\text{Sym}_2(\mathbb{R}^2)\cong \mathbb{R}^3$.
  • Figure 3: Comparison of the most frequent quadratic polynomial learned by a PNN with $\mathbf{d}=(2,2,3)$ and $r=2$ in each coordinate with the ground truth. The green manifold is generated based on the ground truth, while the red manifold is the most frequent function learned by the network.
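
As a worked illustration of the architecture in Figure 2, the sketch below expands a network with widths $(2,1,1)$ and $r=2$ into its coefficient vector in $\mathbb{R}^3$. It assumes no biases and no activation on the output layer, so the network computes $w\,(a x_1 + b x_2)^2$; the names `coefficients` and `discriminant` and the coordinate choice $(c_{20}, c_{11}, c_{02})$ for the monomials $x_1^2, x_1 x_2, x_2^2$ are illustrative choices, not the paper's notation.

```python
def coefficients(a, b, w):
    """Coefficients of w * (a*x1 + b*x2)**2 on the monomials x1^2, x1*x2, x2^2."""
    return (w * a * a, 2 * w * a * b, w * b * b)


def discriminant(c20, c11, c02):
    """Vanishes exactly when the binary quadratic form has rank at most 1."""
    return c11 * c11 - 4 * c20 * c02


# Hypothetical integer weights, so the arithmetic below is exact.
a, b, w = 3, -2, 5
c20, c11, c02 = coefficients(a, b, w)

# Compare the network output with the expanded polynomial at one input.
x1, x2 = 1, 2
network_value = w * (a * x1 + b * x2) ** 2
polynomial_value = c20 * x1 ** 2 + c11 * x1 * x2 + c02 * x2 ** 2

disc = discriminant(c20, c11, c02)
```

Under these assumptions every coefficient vector produced by the network satisfies $c_{11}^2 - 4 c_{20} c_{02} = 0$, a single quadratic equation in $\mathbb{R}^3$.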

Theorems & Definitions (64)

  • Definition 2.1
  • Definition 2.2
  • Proposition 2.3
  • proof
  • Definition 2.4
  • Example 2.5
  • Example 2.6 [kileel2019expressive, Example 3]
  • Lemma 2.7 [kileel2019expressive]
  • Definition 2.8
  • Example 2.9
  • ...and 54 more