On the Geometry and Optimization of Polynomial Convolutional Networks

Vahid Shahverdi; Giovanni Luca Marchetti; Kathlén Kohn

TL;DR

This work analyzes CNNs with monomial activations through the lens of algebraic geometry, showing that after removing filter-scaling symmetries the parameterization is regular and generically one-to-one (birational) with finite fibers. It identifies the neuromanifold as closely related to Segre--Veronese varieties, deriving its dimension $\text{dim}(\text{Neuromanifold}) = |\boldsymbol{k}| - L + 1$ and degree $\text{deg}(\text{Neuromanifold}) = (|\boldsymbol{k}|-L)!\prod_{j=0}^{L-1} \frac{r^{(L-j-1)(k_j-1)}}{(k_j-1)!}$ for $r>1$, and characterizing singularities as nodal points arising from subnetworks. The authors connect optimization to distance-minimization on the neuromanifold and compute the generic Euclidean distance degree, yielding a dataset-independent count of complex critical points for large generic datasets. These results illuminate the expressivity and learning dynamics of polynomial CNNs and suggest pathways for extending algebraic-geometric methods to broader network architectures and activation functions.
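As a quick sanity check on the two formulas above, the following minimal sketch evaluates them for a given tuple of filter sizes $\boldsymbol{k} = (k_0, \dots, k_{L-1})$ and activation degree $r$. The function names `neuromanifold_dimension` and `neuromanifold_degree` are hypothetical, and the code assumes the formulas exactly as stated in this summary, with $|\boldsymbol{k}| = k_0 + \cdots + k_{L-1}$.

```python
from math import factorial


def neuromanifold_dimension(filter_sizes):
    """dim = |k| - L + 1, with |k| the sum of the filter sizes (formula as quoted above)."""
    num_layers = len(filter_sizes)
    return sum(filter_sizes) - num_layers + 1


def neuromanifold_degree(filter_sizes, r):
    """deg = (|k| - L)! * prod_j r^{(L-j-1)(k_j-1)} / (k_j-1)!, stated for r > 1."""
    if r <= 1:
        raise ValueError("the degree formula is stated for r > 1")
    num_layers = len(filter_sizes)

    # Multinomial coefficient (|k| - L)! / prod_j (k_j - 1)!; every division is exact.
    coefficient = factorial(sum(filter_sizes) - num_layers)
    for k_j in filter_sizes:
        coefficient //= factorial(k_j - 1)

    # Product of the powers r^{(L-j-1)(k_j-1)} over the layers j = 0, ..., L-1.
    power = 1
    for j, k_j in enumerate(filter_sizes):
        power *= r ** ((num_layers - j - 1) * (k_j - 1))

    return coefficient * power


if __name__ == "__main__":
    # Two layers with filters of size 2 and quadratic activation.
    print(neuromanifold_dimension([2, 2]), neuromanifold_degree([2, 2], r=2))
```

For $(k_0, k_1) = (2, 2)$ and $r = 2$ this gives dimension 3 and degree 4. The degree is the multinomial coefficient $2!/(1!\,1!) = 2$ times $r^{1} = 2$.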

Abstract

We study convolutional neural networks with monomial activation functions. Specifically, we prove that their parameterization map is regular and is an isomorphism almost everywhere, up to rescaling the filters. Leveraging tools from algebraic geometry, we explore the geometric properties of the image of this map in function space, typically referred to as the neuromanifold. In particular, we compute the dimension and the degree of the neuromanifold, which measure the expressivity of the model, and describe its singularities. Moreover, for a generic large dataset, we derive an explicit formula that quantifies the number of critical points arising in the optimization of a regression loss.
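The phrase "up to rescaling the filters" can be illustrated numerically. The sketch below is not the authors' code. It assumes a two-layer, one-dimensional, stride-1 network with activation $t \mapsto t^r$ between layers and none after the last. The helper `poly_cnn` is a hypothetical name. Since convolution is linear and $(\lambda t)^r = \lambda^r t^r$, scaling the first filter by $\lambda$ and the second by $\lambda^{-r}$ should leave the output unchanged.

```python
import numpy as np


def poly_cnn(filters, x, r):
    """Stack of 1D 'valid' convolutions with activation t -> t**r between layers.

    Assumption for illustration: stride 1 and no activation after the last layer.
    """
    h = x
    for i, w in enumerate(filters):
        h = np.convolve(h, w, mode="valid")
        if i < len(filters) - 1:
            h = h ** r
    return h


rng = np.random.default_rng(0)
x = rng.standard_normal(6)    # input signal
w0 = rng.standard_normal(2)   # first-layer filter
w1 = rng.standard_normal(2)   # second-layer filter
r, lam = 3, 1.7               # activation degree and rescaling factor

out = poly_cnn([w0, w1], x, r)
out_rescaled = poly_cnn([lam * w0, lam ** (-r) * w1], x, r)

# The two parameter settings define the same function, so the outputs agree
# up to floating-point error.
print(np.allclose(out, out_rescaled))
```

This scaling freedom is why the dimension count $|\boldsymbol{k}| - L + 1$ is smaller than the raw number of filter coefficients $|\boldsymbol{k}|$.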

Paper Structure (24 sections, 16 theorems, 31 equations, 3 figures, 1 table)

INTRODUCTION
Summary of Results
RELATED WORK
Algebraic Geometry of Neuromanifolds.
Polynomial Activation Functions.
BACKGROUND
Polynomial Convolutional Networks
Segre--Veronese Varieties
Euclidean Distance Degree
CONVOLUTIONAL NEUROMANIFOLDS
Geometry
Optimization
CONCLUSIONS AND FUTURE WORK
ON THE DIFFERENTIAL OF THE PARAMETRIZATION
ADDITIONAL PROOFS
...and 9 more sections

Key Result

Theorem 3.1

The generic Euclidean distance degree of the Segre--Veronese variety is given by a closed-form expression (not reproduced in this summary), in which $|\mathbf{p}| = p_1 + \cdots + p_{k}$.

Figures (3)

Figure 1: Illustration of a Segre--Veronese variety parametrizing CNNs.
Figure 2: Distance function from an anchor to a curve, visualized as a color gradient. The critical values are denoted by dotted lines.
Figure 3: Visualization of two-dimensional charts of neuromanifolds (over $\mathbb{R}$) corresponding to $(k_0,k_1)=(2,2)$ projected orthogonally to $\mathbb{R}^3$, with varying activation degree.
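The idea behind Figure 2, counting critical points of the distance from an anchor to a curve, can be tried on a toy case. The sketch below uses the parabola $y = x^2$, an illustrative choice that is not taken from the paper, with a hypothetical helper `parabola_critical_points`. Setting the derivative of $(x-a)^2 + (x^2-b)^2$ to zero gives the cubic $4x^3 + (2-4b)x - 2a = 0$, so a generic anchor $(a, b)$ has three complex critical points.

```python
import numpy as np


def parabola_critical_points(a, b):
    """Critical points of the squared distance from (a, b) to the parabola y = x**2.

    d/dx [(x - a)**2 + (x**2 - b)**2] = 4x^3 + (2 - 4b)x - 2a.
    Returns the x-coordinates of all complex roots of this cubic.
    """
    return np.roots([4.0, 0.0, 2.0 - 4.0 * b, -2.0 * a])


def distance_gradient(x, a, b):
    """Derivative of the squared distance with respect to the curve parameter x."""
    return 2.0 * (x - a) + 4.0 * x * (x ** 2 - b)


a, b = 0.3, 2.0  # an arbitrary anchor point
critical_x = parabola_critical_points(a, b)

# The complex count is 3 for every generic anchor; how many of them are real
# depends on where the anchor sits.
num_real = int(np.sum(np.abs(np.imag(critical_x)) < 1e-9))
print(len(critical_x), num_real)
```

The number of complex solutions does not depend on the anchor as long as it is generic. This is the sense in which a Euclidean distance degree gives a dataset-independent count.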

Theorems & Definitions (46)

Definition 1
Definition 2
Definition 3
Definition 4
Theorem 3.1: kozhasov2023minimal
Lemma 4.1
proof
Corollary 4.2
proof
Remark 4.1
...and 36 more
