From Complexity to Clarity: Analytical Expressions of Deep Neural Network Weights via Clifford's Geometric Algebra and Convexity

Mert Pilanci

TL;DR

A novel analysis of neural networks based on geometric (Clifford) algebra and convex optimization is introduced, showing that optimal weights of deep ReLU neural networks are given by the wedge product of training samples when trained with standard regularized loss.

Abstract

In this paper, we introduce a novel analysis of neural networks based on geometric (Clifford) algebra and convex optimization. We show that optimal weights of deep ReLU neural networks are given by the wedge product of training samples when trained with standard regularized loss. Furthermore, the training problem reduces to convex optimization over wedge product features, which encode the geometric structure of the training dataset. This structure is given in terms of signed volumes of triangles and parallelotopes generated by data vectors. The convex problem finds a small subset of samples via $\ell_1$ regularization to discover only relevant wedge product features. Our analysis provides a novel perspective on the inner workings of deep neural networks and sheds light on the role of the hidden layers.
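To make the "signed volume" language concrete, here is a minimal sketch in Python with NumPy. It is not the paper's construction of the network weights. It only illustrates the standard fact that the wedge product of two vectors in $\mathbb{R}^2$ has a single coefficient equal to the signed area of the parallelogram they span, and that the analogous signed volume of $d$ vectors in $\mathbb{R}^d$ is a determinant. The function names and sample vectors are hypothetical and chosen for illustration.

```python
import numpy as np

def wedge_2d(a, b):
    """Coefficient of e1^e2 in a ^ b: signed area of the parallelogram spanned by a and b."""
    return a[0] * b[1] - a[1] * b[0]

def signed_triangle_area(a, b):
    """Signed area of the triangle with vertices 0, a, b (half the parallelogram)."""
    return 0.5 * wedge_2d(a, b)

def signed_volume(vectors):
    """Signed volume of the parallelotope spanned by d vectors in R^d (a determinant)."""
    return float(np.linalg.det(np.stack(vectors)))

# Hypothetical data vectors, for illustration only.
x1 = np.array([1.0, 0.0])
x2 = np.array([0.0, 2.0])

area = wedge_2d(x1, x2)          # positive orientation
flipped = wedge_2d(x2, x1)       # swapping the arguments flips the sign
unit_cube = signed_volume([np.eye(3)[i] for i in range(3)])
```

The sign change under swapping the two arguments is the antisymmetry of the wedge product; the magnitude alone is the ordinary area or volume.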

Paper Structure (56 sections, 28 theorems, 176 equations, 40 figures)

Key Result

Theorem 1

For all values of the regularization norm $p\in[1,\infty)$, the two-layer neural network problem in eq:two_layer_relu can be recast as the following $\ell_1$-regularized convex optimization problem where the entries of the matrix $K$ are given by and the number of neurons obeys $m\ge \|z^*\|_0$. An optimal network can be constructed as where $z^*$ and $t^*$ are optimizers of eq:convex_one_dim_tw

Figures (40)

  • Figure 1: One-dimensional neuron
  • Figure 2: Two-dimensional neuron
  • Figure 3: Optimal breaklines in $\mathbb{R}^2$
  • Figure 5: Wedge product of two-dimensional (a) and three-dimensional (b) vectors in $\mathbb{G}^2$ and $\mathbb{G}^3$, and the triangular area defined in the convex program from Theorem \ref{thm:main_two_dim_nobias} (c)-(d).
  • Figure 6: Two-layer network without biases
  • ...and 35 more figures

Theorems & Definitions (62)

  • Theorem 1
  • Remark
  • Remark
  • Theorem 2
  • Remark
  • Definition 1: Near-optimal solutions
  • Definition 2: Range dispersion in $\mathbb{R}^2$
  • Theorem 3
  • Theorem 4
  • Remark
  • ...and 52 more