A Survey on Universal Approximation Theorems

Midhun T Augustine

TL;DR

This survey analyzes universal approximation theorems (UATs) for feedforward neural networks, tracing results from early function-approximation theory (Taylor, Fourier, Weierstrass, Kolmogorov–Arnold) to modern NN-based density results in spaces like $\mathcal{C}(\mathbb{X})$ and $L^p$. It distinguishes the arbitrary-width (bounded-depth) setting from the arbitrary-depth (bounded-width) setting, highlighting that nonpolynomial activations make shallow networks universal and reporting width thresholds (e.g., $W\le n+4$ and $W^* = \max\{n+1, m\}$) for universal approximation. The paper also documents historical milestones (Le Cun, Lapedes–Farber, Cybenko, Hornik, Leshno) and clarifies how depth contributes to expressivity beyond width limitations, with implications for architecture design. By connecting classical approximation theory to NN expressivity and outlining extensions to other architectures, the work provides a consolidated reference for researchers assessing the theoretical capabilities of neural networks in real-world settings.

Abstract

This paper discusses various theorems on the approximation capabilities of neural networks (NNs), which are known as universal approximation theorems (UATs). The paper gives a systematic overview of UATs, starting from preliminary results on function approximation such as Taylor's theorem, Fourier's theorem, the Weierstrass approximation theorem, and the Kolmogorov–Arnold representation theorem. Theoretical and numerical aspects of UATs are covered for both the arbitrary-width and the arbitrary-depth cases.

Paper Structure (7 sections, 10 theorems, 17 equations, 6 figures)

Introduction
Neural Network (NN)
Universal Approximation Theorems (UATs)
UATs: Predecessors
UATs: Arbitrary width case
UATs: Arbitrary depth case
Conclusions and Further reading

Key Result

Theorem 1

Any continuous function $f:\mathbb{R} \rightarrow \mathbb{R}$ that is $k$ times differentiable at $a\in \mathbb{R}$ can be represented as a sum of polynomial terms plus a residual: $f(x)=\sum_{i=0}^{k} c_{i}(x-a)^{i}+R_{k}(x)$, where $c_{i}=\frac{f^{(i)}(a)}{i!}= \frac{1}{i!} \frac{d^{i}}{dx^i}f(x)|_{x=a}$ and $R_{k}(x)=o(|x-a|^{k})$ is the residual term.
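The statement of Theorem 1 can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes $f=\exp$, $a=0$ and $k=3$ purely for illustration, and the function names `taylor_exp` and `residual` are hypothetical. It prints the residual $R_k(a+h)$ and the ratio $R_k(a+h)/h^k$, which should shrink as $h \to 0$ if $R_k(x)=o(|x-a|^k)$.

```python
import math


def taylor_exp(x, a, k):
    """Degree-k Taylor polynomial of exp about a, evaluated at x.

    Every derivative of exp at a equals exp(a), so c_i = exp(a) / i!.
    """
    return sum(
        math.exp(a) / math.factorial(i) * (x - a) ** i for i in range(k + 1)
    )


def residual(x, a, k):
    """R_k(x) = f(x) minus the degree-k Taylor polynomial, for f = exp."""
    return math.exp(x) - taylor_exp(x, a, k)


if __name__ == "__main__":
    a, k = 0.0, 3  # illustrative choices, not values from the paper
    for h in (0.1, 0.01, 0.001):
        r = residual(a + h, a, k)
        # The ratio r / h**k should tend to 0 as h shrinks.
        print(f"h={h:<6} R_k={r:.3e} R_k/h^k={r / h**k:.3e}")
```

For this particular choice of $f$, the leading term of the residual is $h^{4}/24$, so the printed ratio should scale roughly like $h/24$.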

Figures (6)

Figure 1: (a) Neural Network (b) Neuron.
Figure 2: Graph of activation functions: (a) ReLU (b) Step (c) Logistic (d) Tanh.
Figure 3: Illustrating NN.
Figure 4: (a) NN with arbitrary width (b) NN with arbitrary depth.
Figure 5: Output of NNs with one hidden layer and ReLU activation function.
...and 1 more figure

Theorems & Definitions (10)

Theorem 1: Taylor, 1715
Theorem 2: Fourier, 1807
Theorem 3: Weierstrass, 1885
Theorem 4: Kolmogorov and Arnold, 1959
Theorem 5: Pascanu et al., 2013
Theorem 6: Funahashi, Hornik et al., and Cybenko, 1989
Theorem 7: Leshno et al., 1993
Theorem 8: Lu et al., 2017
Theorem 9: Lu et al., 2017
Theorem 10: Park et al., 2021
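
The arbitrary-width results listed above can be illustrated with a small numerical experiment. The sketch below is an assumption-laden illustration, not the construction or proof used in the paper: it builds a one-hidden-layer ReLU network whose output is the piecewise-linear interpolant of a target function at equally spaced knots, then measures the sup-norm error on a grid as the width grows. The target (`math.sin` on $[0,\pi]$), the widths, and all function names (`build_shallow_relu`, `forward`, `sup_error`) are hypothetical choices.

```python
import math


def relu(z):
    return max(0.0, z)


def build_shallow_relu(f, lo, hi, width):
    """Parameters of a one-hidden-layer ReLU net with `width` hidden units.

    On [lo, hi] the network equals the piecewise-linear interpolant of f
    at width + 1 equally spaced knots. Hidden weights are fixed to 1.
    """
    knots = [lo + (hi - lo) * j / width for j in range(width + 1)]
    values = [f(x) for x in knots]
    slopes = [
        (values[j + 1] - values[j]) / (knots[j + 1] - knots[j])
        for j in range(width)
    ]
    # Each output weight is the change in slope at the corresponding knot.
    out_weights = [slopes[0]] + [
        slopes[j] - slopes[j - 1] for j in range(1, width)
    ]
    biases = [-knots[j] for j in range(width)]
    return biases, out_weights, values[0]


def forward(params, x):
    """Network output: out_bias + sum_j w_j * relu(x + b_j)."""
    biases, out_weights, out_bias = params
    return out_bias + sum(w * relu(x + b) for w, b in zip(out_weights, biases))


def sup_error(f, params, lo, hi, samples=2001):
    """Maximum absolute error on an equally spaced evaluation grid."""
    grid = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - forward(params, x)) for x in grid)


if __name__ == "__main__":
    for width in (4, 16, 64):
        params = build_shallow_relu(math.sin, 0.0, math.pi, width)
        print(width, sup_error(math.sin, params, 0.0, math.pi))
```

The printed error should decrease as the width increases, which is the qualitative behaviour the arbitrary-width theorems describe for continuous targets on a compact set. The weights here are computed in closed form rather than trained, so the sketch says nothing about what gradient-based training would find.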
