Minimum Width of Leaky-ReLU Neural Networks for Uniform Universal Approximation

Li'ang Li, Yifei Duan, Guanghua Ji, Yongqiang Cai

TL;DR

This work establishes the exact minimum width for uniform universal approximation by leaky-ReLU networks for continuous vector-valued functions on compact domains, showing $w_{\min}=\max(d_x,d_y)+\Delta(d_x,d_y)$, where $\Delta$ counts the additional dimensions needed to approximate continuous functions by diffeomorphisms via embedding. The authors introduce a lift-flow-discretization framework that links uniform UAP to topology: lift the target to a higher-dimensional diffeomorphism, approximate it by the flow of a neural ODE, and discretize the flow with leaky-ReLU networks to achieve the approximation with width $w_{\min}$. The analysis combines topological embedding results (Whitney-type) with constructive neural ODE approximations, and delineates how the output dimension $d_y$ influences the required width, including a detailed treatment of the regimes $d_x+1\le d_y\le 2d_x$ and $d_y>2d_x$. The work also discusses implications for ReLU networks and monotone activations, offering a topological perspective on minimal widths and highlighting open questions about the precise value of $\Delta(d_x,d_y)$ in general.
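The "flow, then discretize" step can be illustrated with a small sketch. It is not the paper's construction: the vector field `tanh(A x + b)`, the matrices `A` and `b`, and the explicit Euler scheme are all assumptions chosen for illustration (the paper discretizes with leaky-ReLU layers). The point is only that a flow map is invertible, which is why diffeomorphisms are the natural objects to approximate.

```python
import math

def velocity(x, A, b):
    # Hypothetical tanh vector field v(x) = tanh(A x + b); the exact form
    # of the paper's neural ODE is not given in this summary.
    return [math.tanh(sum(a * xj for a, xj in zip(row, x)) + bi)
            for row, bi in zip(A, b)]

def euler_flow(x0, A, b, tau=1.0, steps=1000, direction=1.0):
    # Explicit Euler discretization of dx/dt = v(x) up to time tau.
    # direction=-1.0 integrates backwards in time.
    h = direction * tau / steps
    x = list(x0)
    for _ in range(steps):
        v = velocity(x, A, b)
        x = [xi + h * vi for xi, vi in zip(x, v)]
    return x

# Illustrative parameters (assumptions, not taken from the paper).
A = [[0.0, -1.0], [1.0, 0.0]]
b = [0.5, -0.25]
x0 = [0.3, -0.7]

x1 = euler_flow(x0, A, b)                       # forward flow map
x_back = euler_flow(x1, A, b, direction=-1.0)   # approximate inverse
roundtrip_error = math.dist(x_back, x0)
```

Integrating forward and then backward returns close to the starting point, with an error that shrinks as `steps` grows.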

Abstract

The study of universal approximation properties (UAP) for neural networks (NN) has a long history. When the network width is unlimited, a single hidden layer is sufficient for UAP. In contrast, when the depth is unlimited, the width for UAP must be at least the critical width $w^*_{\min}=\max(d_x,d_y)$, where $d_x$ and $d_y$ are the dimensions of the input and output, respectively. Recently, \cite{cai2022achieve} showed that a leaky-ReLU NN with this critical width can achieve UAP for $L^p$ functions on a compact domain $K$, i.e., the UAP for $L^p(K,\mathbb{R}^{d_y})$. This paper examines a uniform UAP for the function class $C(K,\mathbb{R}^{d_y})$ and gives the exact minimum width of the leaky-ReLU NN as $w_{\min}=\max(d_x,d_y)+\Delta(d_x,d_y)$, where $\Delta(d_x,d_y)$ is the number of additional dimensions needed to approximate continuous functions with diffeomorphisms via embedding. To obtain this result, we propose a novel lift-flow-discretization approach, which shows that the uniform UAP has a deep connection with topological theory.
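To make "width" and "depth" concrete, here is a minimal sketch of a deep, narrow leaky-ReLU network whose hidden layers all have width $\max(d_x,d_y)+\Delta$. Everything in it is an assumption for illustration: the function names, the random weights, the slope 0.1, and the convention that depth counts hidden layers. Because this summary does not state the value of $\Delta(d_x,d_y)$, the sketch takes it as a caller-supplied argument `delta` rather than computing it.

```python
import random

def leaky_relu(z, slope=0.1):
    # Leaky-ReLU with an illustrative slope for negative inputs.
    return z if z >= 0 else slope * z

def affine(W, b, x):
    # Computes W x + b for a list-of-rows matrix W.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def random_layer(n_out, n_in, rng):
    W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    b = [rng.uniform(-1, 1) for _ in range(n_out)]
    return W, b

def build_narrow_net(d_x, d_y, delta, depth, seed=0):
    # Width follows the formula max(d_x, d_y) + delta; delta is supplied
    # by the caller because its value is not given in this summary.
    width = max(d_x, d_y) + delta
    rng = random.Random(seed)
    dims = [d_x] + [width] * depth + [d_y]
    layers = [random_layer(n_out, n_in, rng)
              for n_in, n_out in zip(dims[:-1], dims[1:])]
    return width, layers

def forward(layers, x):
    # Leaky-ReLU after every hidden layer, plain affine output layer.
    for W, b in layers[:-1]:
        x = [leaky_relu(z) for z in affine(W, b, x)]
    W, b = layers[-1]
    return affine(W, b, x)

width, layers = build_narrow_net(d_x=2, d_y=3, delta=1, depth=5)
y = forward(layers, [0.1, -0.2])
```

In this setting the width stays fixed while the depth is free to grow, which is the regime the minimum-width result addresses.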

Paper Structure (25 sections, 10 theorems, 21 equations, 3 figures, 2 tables)

Key Result

Theorem 2.2

Let $\mathcal{K} \subset \mathbb{R}^{d_x}$ be a compact set; then, for the continuous function class $C(\mathcal{K},\mathbb{R}^{d_y})$, the minimum width $w_{\min}$ of leaky-ReLU neural networks having $C$-UAP is $w_{\min}=\max(d_x,d_y)+\Delta(d_x, d_y)$, where $\Delta(d_x,d_y)$ is the auxiliary for

Figures (3)

  • Figure 1: Continuous function, neural network, and diffeomorphism. (a) An example function from $\mathcal{K} \subset \mathbb{R}^{d_x}$ to $\mathbb{R}^{d_y}$. (b) Feedforward neural networks with depth $L$ and width $N$. (c) Approximate $f^*$ using two linear transformations, $\alpha$ and $\beta$, and a diffeomorphism $\Phi$. (d) An intuitive construction of $\Phi$, where $\alpha$ and $\beta$ are a lift and a projection, respectively, i.e., $\alpha(x_1,x_2)=(x_1,x_2,0), \beta(z_1,z_2,z_3)=(z_1,z_2)$.
  • Figure 2: Sketch of the lift-flow-discretization approach. The target map $\Phi(x)$ is approximated by a flow map $\tilde{\phi}^\tau(x)$ of an ODE, which is further approximated by a flow map $\phi^\tau(x)$ of a neural ODE (\ref{eq:tanh_ODE}).
  • Figure 3: Example of $d_x=1$. Approximate the '4'-shape curve (a) in $\mathbb{R}^2$ by lifting it to the three-dimensional curve (b) in $\mathbb{R}^3$.
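The lifting idea in Figures 1(d) and 3 can be sketched in a few lines. The maps `alpha` and `beta` follow the formulas in the Figure 1 caption; the planar figure-eight curve and the choice of the parameter `t` as the extra coordinate are hypothetical stand-ins for the '4'-shape curve, whose actual parametrization is not given here.

```python
import math

def alpha(x1, x2):
    # Lift from R^2 to R^3, as in the Figure 1(d) caption.
    return (x1, x2, 0.0)

def beta(z1, z2, z3):
    # Projection from R^3 back to R^2, as in the Figure 1(d) caption.
    return (z1, z2)

def planar_curve(t):
    # Hypothetical self-intersecting curve in R^2 (a figure-eight);
    # it passes through the origin at both t = 0 and t = pi.
    return (math.sin(t), math.sin(2.0 * t))

def lifted_curve(t):
    # Adding t as a third coordinate separates the crossing, so the
    # lifted curve in R^3 is injective.
    y1, y2 = planar_curve(t)
    return (y1, y2, t)

t_a, t_b = 0.0, math.pi
planar_gap = math.dist(planar_curve(t_a), planar_curve(t_b))
lifted_gap = math.dist(lifted_curve(t_a), lifted_curve(t_b))
```

Here `planar_gap` is zero up to floating-point rounding while `lifted_gap` is about $\pi$, and projecting the lifted curve with `beta` recovers the planar curve.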

Theorems & Definitions (19)

  • Definition 2.1
  • Theorem 2.2
  • Lemma 2.3
  • Lemma 2.4
  • Lemma 3.1
  • Lemma 3.2
  • Corollary 3.3
  • Lemma 4.1
  • Lemma 4.2
  • Corollary 5.1
  • ...and 9 more