From Complexity to Clarity: Analytical Expressions of Deep Neural Network Weights via Clifford's Geometric Algebra and Convexity

Mert Pilanci

TL;DR

A novel analysis of neural networks based on geometric (Clifford) algebra and convex optimization is introduced, showing that optimal weights of deep ReLU neural networks are given by the wedge product of training samples when trained with standard regularized loss.

Abstract

In this paper, we introduce a novel analysis of neural networks based on geometric (Clifford) algebra and convex optimization. We show that optimal weights of deep ReLU neural networks are given by the wedge product of training samples when trained with standard regularized loss. Furthermore, the training problem reduces to convex optimization over wedge product features, which encode the geometric structure of the training dataset. This structure is given in terms of signed volumes of triangles and parallelotopes generated by data vectors. The convex problem finds a small subset of samples via $\ell_1$ regularization to discover only relevant wedge product features. Our analysis provides a novel perspective on the inner workings of deep neural networks and sheds light on the role of the hidden layers.
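To make the "signed volumes" in the abstract concrete, here is a minimal NumPy sketch of the standard facts that the wedge product of two vectors in $\mathbb{R}^2$ has a single coefficient equal to the signed area of the parallelogram they span, and that the analogous quantity for $d$ vectors in $\mathbb{R}^d$ is a determinant. The function names `wedge_2d` and `signed_volume` are illustrative choices, not taken from the paper, and the sketch does not reproduce the paper's construction of network weights.

```python
import numpy as np

def wedge_2d(a, b):
    """Coefficient of a ^ b on the unit bivector e1 ^ e2.

    This equals the signed area of the parallelogram spanned by a and b.
    """
    return a[0] * b[1] - a[1] * b[0]

def signed_volume(vectors):
    """Signed volume of the parallelotope spanned by d vectors in R^d.

    Computed as the determinant of the matrix whose rows are the vectors.
    """
    return float(np.linalg.det(np.stack(vectors)))

a = np.array([2.0, 0.0])
b = np.array([1.0, 3.0])

parallelogram_area = wedge_2d(a, b)        # signed area of the parallelogram
triangle_area = 0.5 * parallelogram_area   # signed area of the triangle (0, a, b)
swapped_area = wedge_2d(b, a)              # antisymmetry: the sign flips
```

Swapping the two arguments flips the sign, which is the antisymmetry that distinguishes a signed area from an ordinary one.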

Paper Structure

This paper contains 56 sections, 28 theorems, 176 equations, and 40 figures.

Introduction
Prior work
Summary of results
Notation
Setting and Methodology
Preliminaries
Geometric Algebra
Convex duality
Theoretical Results
One-dimensional data
Two-dimensional data
$\ell_1$ regularization - neurons without biases
$\ell_2$ regularization (weight decay) - neurons with biases
Arbitrary dimensions
$\ell_1$ regularization - neurons without biases
...and 41 more sections

Key Result

Theorem 1

For all values of the regularization norm $p\in[1,\infty)$, the two-layer neural network problem in eq:two_layer_relu can be recast as the following $\ell_1$-regularized convex optimization problem [equation omitted], where the entries of the matrix $K$ are given by [equation omitted] and the number of neurons obeys $m\ge \|z^*\|_0$. An optimal network can be constructed as [equation omitted], where $z^*$ and $t^*$ are optimizers of eq:convex_one_dim_tw
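The following sketch illustrates, under stated assumptions, how an $\ell_1$-regularized convex problem of this general shape produces a sparse $z^*$ whose nonzero count $\|z^*\|_0$ bounds the number of neurons needed. The theorem's definition of $K$ and its loss function are not reproduced in this summary, so the code uses a random placeholder matrix `K`, a squared loss, and an unpenalized scalar offset `t`; all three are assumptions for illustration only. The solver is plain proximal gradient descent (ISTA), chosen for brevity rather than because the paper uses it.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(K, y, z, t, beta):
    """Assumed objective: 0.5 * ||K z + t - y||^2 + beta * ||z||_1."""
    r = K @ z + t - y
    return 0.5 * float(r @ r) + beta * float(np.abs(z).sum())

def l1_regularized_fit(K, y, beta, n_iter=2000):
    """Proximal gradient descent on the assumed objective above.

    z is penalized by the l1 norm; the scalar offset t is not penalized.
    """
    n, d = K.shape
    A = np.hstack([K, np.ones((n, 1))])
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    z = np.zeros(d)
    t = 0.0
    for _ in range(n_iter):
        r = K @ z + t - y                    # residual at the current iterate
        z = soft_threshold(z - step * (K.T @ r), step * beta)
        t = t - step * float(r.sum())
    return z, t

# Placeholder data: K is random here, NOT the matrix defined in the theorem.
rng = np.random.default_rng(0)
K = rng.standard_normal((20, 50))
y = 2.0 * K[:, 3] + 1.0

z_star, t_star = l1_regularized_fit(K, y, beta=0.5)
neurons_needed = int(np.count_nonzero(z_star))   # plays the role of ||z*||_0
```

Increasing `beta` drives more entries of `z_star` to exactly zero, which in the theorem's setting corresponds to a network with fewer neurons.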

Figures (40)

Figure 1: One-dimensional neuron
Figure 2: Two-dimensional neuron
Figure 3: Optimal breaklines in $\mathbb{R}^2$
Figure 5: Wedge product of two-dimensional (a) and three-dimensional (b) vectors in $\mathbb{G}^2$ and $\mathbb{G}^3$, and the triangular area defined in the convex program from Theorem \ref{thm:main_two_dim_nobias} (c)-(d).
Figure 6: Two-layer network without biases
...and 35 more figures

Theorems & Definitions (62)

Theorem 1
Remark
Remark
Theorem 2
Remark
Definition 1: Near-optimal solutions
Definition 2: Range dispersion in $\mathbb{R}^2$
Theorem 3
Theorem 4
Remark
...and 52 more
