Universal approximation with complex-valued deep narrow neural networks

Paul Geuchen, Thomas Jahn, Hannes Matt

TL;DR

It is shown that deep narrow complex-valued networks are universal if and only if their activation function is neither holomorphic, nor antiholomorphic, nor $\mathbb{R}$-affine. A width of $2n+2m+5$ is always sufficient, and in general a width of $\max\{2n,2m\}$ is necessary.

Abstract

We study the universality of complex-valued neural networks with bounded widths and arbitrary depths. Under mild assumptions, we give a full description of those activation functions $\varrho:\mathbb{C}\to \mathbb{C}$ that have the property that their associated networks are universal, i.e., are capable of approximating continuous functions to arbitrary accuracy on compact domains. Precisely, we show that deep narrow complex-valued networks are universal if and only if their activation function is neither holomorphic, nor antiholomorphic, nor $\mathbb{R}$-affine. This is a much larger class of functions than in the dual setting of arbitrary width and fixed depth. Unlike in the real case, the sufficient width differs significantly depending on the considered activation function. We show that a width of $2n+2m+5$ is always sufficient and that in general a width of $\max\{2n,2m\}$ is necessary. We prove, however, that a width of $n+m+3$ suffices for a rich subclass of the admissible activation functions. Here, $n$ and $m$ denote the input and output dimensions of the considered networks. Moreover, for the case of smooth and non-polyharmonic activation functions, we provide a quantitative approximation bound in terms of the depth of the considered networks.
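To make the quantities in the abstract concrete, the following is a minimal NumPy sketch, not the authors' code. It tabulates the three width values quoted above, estimates Wirtinger derivatives by finite differences to probe whether a sample activation looks holomorphic or antiholomorphic at a point, and runs a forward pass through a deep narrow network. The function names (`width_bounds`, `wirtinger`, `modrelu`, `random_narrow_net`, `forward`), the modReLU-style activation, and the convention of complex affine maps followed by a componentwise activation are all assumptions made for illustration. Whether a particular activation meets the hypotheses of the paper's theorems has to be checked separately.

```python
import numpy as np

def width_bounds(n, m):
    """Widths quoted in the abstract for input dimension n and output dimension m."""
    return {
        "always_sufficient": 2 * n + 2 * m + 5,
        "necessary_in_general": max(2 * n, 2 * m),
        "sufficient_for_subclass": n + m + 3,
    }

def wirtinger(rho, z, h=1e-6):
    """Central-difference estimates of (d rho / dz, d rho / d z-bar) at the point z."""
    dx = (rho(z + h) - rho(z - h)) / (2 * h)            # partial derivative in Re(z)
    dy = (rho(z + 1j * h) - rho(z - 1j * h)) / (2 * h)  # partial derivative in Im(z)
    return 0.5 * (dx - 1j * dy), 0.5 * (dx + 1j * dy)

def modrelu(z, b=-0.5):
    """Illustrative activation (assumption, not from the source):
    adds b to the modulus, clips it at zero, and keeps the phase."""
    r = np.abs(z)
    return np.maximum(r + b, 0.0) * z / np.maximum(r, 1e-12)

def random_narrow_net(n, m, width, depth, rng):
    """Random complex (weight, bias) pairs for `depth` hidden layers of equal `width`."""
    dims = [n] + [width] * depth + [m]

    def cplx(*shape):
        return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

    return [(cplx(o, i), cplx(o)) for i, o in zip(dims[:-1], dims[1:])]

def forward(z, layers, rho):
    """Affine map followed by rho in every hidden layer; the last layer is affine only."""
    for W, b in layers[:-1]:
        z = rho(W @ z + b)
    W, b = layers[-1]
    return W @ z + b

if __name__ == "__main__":
    n, m = 2, 1
    bounds = width_bounds(n, m)
    print(bounds)

    z0 = 0.3 + 0.2j
    samples = [("exp", np.exp), ("conj", np.conj), ("|z|^2", lambda z: z * np.conj(z))]
    for name, rho in samples:
        d, dbar = wirtinger(rho, z0)
        print(name, abs(d), abs(dbar))

    rng = np.random.default_rng(0)
    layers = random_narrow_net(n, m, bounds["always_sufficient"], depth=4, rng=rng)
    print(forward(rng.standard_normal(n) + 0j, layers, modrelu))
```

In the loop over sample functions, the second estimate is numerically zero for `exp` (holomorphic), the first is numerically zero for `conj` (antiholomorphic), and neither vanishes for $z\mapsto z\overline{z}$ at the chosen point. A pointwise check like this can only rule a property out. It cannot establish universality.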

Paper Structure (18 sections, 29 theorems, 160 equations, 5 figures)

Key Result

Theorem 1.1

Let $n,m\in\mathbb{N}$, and $\varrho:\mathbb{C}\to\mathbb{C}$ be a continuous function which at some point is real differentiable with non-vanishing derivative. Then $\mathcal{NN}^\varrho_{n,m,2n+2m+5}$ is universal if and only if $\varrho$ is neither holomorphic, nor antiholomorphic, nor $\mathbb{R}$-affine.
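
As background for the three excluded classes, here are the standard textbook characterizations in terms of Wirtinger derivatives, for a real continuously differentiable $\varrho$ and $z = x + iy$. They are stated as a reading aid and are not quoted from the paper; the constants $a, b, c$ are introduced only for illustration.

```latex
\[
\partial \varrho = \tfrac{1}{2}\left(\frac{\partial \varrho}{\partial x} - i\,\frac{\partial \varrho}{\partial y}\right),
\qquad
\overline{\partial} \varrho = \tfrac{1}{2}\left(\frac{\partial \varrho}{\partial x} + i\,\frac{\partial \varrho}{\partial y}\right).
\]
\begin{align*}
\varrho \text{ holomorphic} &\iff \overline{\partial}\varrho \equiv 0, \\
\varrho \text{ antiholomorphic} &\iff \partial\varrho \equiv 0, \\
\varrho \ \mathbb{R}\text{-affine} &\iff \varrho(z) = a z + b\,\overline{z} + c \ \text{ for some } a, b, c \in \mathbb{C}.
\end{align*}
```

For instance, $\varrho(z) = z\overline{z}$ has $\partial\varrho = \overline{z}$ and $\overline{\partial}\varrho = z$, so it is neither holomorphic nor antiholomorphic, and it is not of the form $az + b\overline{z} + c$.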

Figures (5)

  • Figure 1: Our results in a nutshell.
  • Figure 2: Illustration of the neural network building blocks from \ref{prop: main}. Neurons in the input and output layers are depicted in filled dots at the top and bottom, respectively. Applications of the activation function $\varrho$ are shown as circles.
  • Figure 3: Illustration of the neural network building block from \ref{prop: approx}. Neurons in the input and output layers are depicted in filled dots at the top and bottom, respectively. Applications of the activation function $\varrho$ are shown as circles.
  • Figure 4: Illustration of the neural network building block from \ref{prop: multiplication_approx}. Neurons in the input and output layers are depicted in filled dots at the top and bottom, respectively. Applications of the activation function $\varrho$ are shown as circles. From the input values $z$ and $w$, three or two linear combinations are computed. Then the building block from \ref{fig:approx} is inserted to approximate $z\mapsto z^2$, $z\mapsto \overline{z}^2$, or $z\mapsto z\overline{z}$. The results are again combined linearly.
  • Figure 5: Illustration of the register model from \ref{register_model}. Neurons that use the complex identity as activation function are shown as squares, whereas neurons that use $\varrho$ as activation function are shown as circles. The in-register neurons (squares on the left) store the input values and pass them to the computation neurons (middle circles). The results of the computations are added up and stored in the out-register neurons (squares on the right). The dashed box highlights one of the blocks that are later replaced using approximations of the complex identity.

Theorems & Definitions (61)

  • Theorem 1.1
  • Lemma 2.1
  • Definition 2.2
  • Definition 2.3
  • Definition 2.4
  • Proposition 3.1
  • proof
  • Proposition 3.2
  • proof
  • Proposition 3.3
  • ...and 51 more