Towards an Algebraic Framework For Approximating Functions Using Neural Network Polynomials

Shakil Rafi, Joshua Lee Padgett, Ukash Nakarmi

TL;DR

The paper develops an algebraic framework for neural network objects, treating feedforward nets as mathematical entities that can mimic real-valued operations. It constructs a hierarchy of neural networks, including $\mathsf{Pwr}_n^{q,\varepsilon}$, $\mathsf{Pnm}^{q,\varepsilon}_{n,C}$, $\mathsf{Xpn}^{q,\varepsilon}_n$, $\mathsf{Csn}^{q,\varepsilon}_n$, and $\mathsf{Sne}^{q,\varepsilon}_n$, and combines them with trapezoidal integration and interpolation tools to realize one-dimensional function approximation and to approximate $\int_a^b e^x\,dx$ via $\mathsf{E}^{N,h,q,\varepsilon}_n$. The work provides explicit depth, width, and parameter bounds, demonstrating polynomial growth in network size relative to target accuracy measured in the $1$-norm, and opens a path to algebraic manipulation of neural networks as approximants to classical functions. It also introduces tunneling networks and network-diagram-based representations to manage unequal depths and sums, enabling scalable construction of neural-network-based approximants for polynomials, exponentials, and trigonometric functions. The framework aims to support applications in solving PDEs and related problems by providing tractable, algebraic neural approximants with provable error controls.
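As a rough numerical analogue of the $\int_a^b e^x\,dx$ construction mentioned above, the sketch below applies the composite trapezoidal rule to a truncated Taylor polynomial of $e^x$. It uses no neural networks: `taylor_exp` is only a plain-arithmetic stand-in for an exponential approximant, and the names `taylor_exp`, `trapezoid`, and `approx_integral_exp` are hypothetical, chosen for illustration rather than taken from the paper. Here `N` is assumed to be the number of subintervals, `h` the mesh width, and `n` the polynomial degree.

```python
import math


def taylor_exp(x: float, n: int) -> float:
    """Degree-n Taylor polynomial of e^x about 0 (stand-in approximant, not a network)."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k  # term now equals x^k / k!
        total += term
    return total


def trapezoid(f, a: float, b: float, N: int) -> float:
    """Composite trapezoidal rule on N equal subintervals of width h = (b - a) / N."""
    h = (b - a) / N
    interior = sum(f(a + i * h) for i in range(1, N))
    return h * (0.5 * f(a) + interior + 0.5 * f(b))


def approx_integral_exp(a: float, b: float, N: int, n: int) -> float:
    """Approximate the integral of e^x over [a, b] by integrating the polynomial stand-in."""
    return trapezoid(lambda x: taylor_exp(x, n), a, b, N)


if __name__ == "__main__":
    exact = math.e - 1.0  # integral of e^x over [0, 1]
    approx = approx_integral_exp(0.0, 1.0, N=1000, n=12)
    print(abs(approx - exact))
```

The total error in this sketch splits into a quadrature part, which shrinks as `N` grows, and an approximant part, which shrinks as `n` grows. The paper's bounds track the analogous trade-off in terms of network depth and parameter count.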

Abstract

We make the case for neural network objects and extend an existing neural network calculus, explained in detail in Chapter 2 of \cite{bigbook}. Our aim is to show that it makes sense to talk about neural network polynomials, exponentials, sines, and cosines, in the sense that they approximate their real-number counterparts subject to limitations on certain of their parameters, $q$ and $\varepsilon$. Along the way, we show that parameter count and depth grow only polynomially in the desired accuracy (defined as a 1-norm difference over $\mathbb{R}$), thereby showing that this approach to approximation, where a neural network in some sense has the structural properties of the function it approximates, is not entirely intractable.

Paper Structure

This paper contains 26 sections, 31 theorems, 226 equations, and 5 figures.

Key Result

Lemma 2.7

Let $\nu_1, \nu_2 \in \mathsf{NN}$ and suppose $\mathsf{O}(\nu_1) = \mathsf{I}(\nu_2)$. Then we have: $\mathsf{D}(\nu_1 \bullet \nu_2) = \mathsf{D}(\nu_1) + \dots$

Figures (5)

  • Figure 1: A representation of a typical $\mathsf{Pwr}^{q,\varepsilon}_n$ network.
  • Figure 2: Neural network diagram for an elementary neural network polynomial.
  • Figure 3: Diagram of $\mathsf{E}^{N,h,q,\varepsilon}_n$.
  • Figure 4: Neural network diagram for $\mathsf{Mxm}^5$.
  • Figure 5: Neural network diagram for the $\mathsf{MC}_{x,y}^{N,d,l}$ network.

Theorems & Definitions (106)

  • Definition 2.1
  • Remark 2.2
  • Definition 2.3: Instantiation with an activation function
  • Remark 2.4
  • Remark 2.5
  • Definition 2.6: Compositions of ANNs
  • Lemma 2.7
  • proof
  • Definition 2.8
  • Definition 2.9: The $\mathsf{Cpy}$ Network
  • ...and 96 more