Minimum Width of Leaky-ReLU Neural Networks for Uniform Universal Approximation

Li'ang Li; Yifei Duan; Guanghua Ji; Yongqiang Cai

TL;DR

This work establishes the exact minimum width for uniform universal approximation of continuous vector-valued functions on compact domains by leaky-ReLU networks, showing that $w_{\min}=\max(d_x,d_y)+\Delta(d_x,d_y)$, where $\Delta(d_x,d_y)$ is the number of extra dimensions needed to approximate continuous functions by diffeomorphisms via embedding. The authors introduce a lift-flow-discretization framework that links uniform UAP to topology: lift the target function to a diffeomorphism in a higher-dimensional space, realize that diffeomorphism as the flow map of a neural ODE, and discretize the flow with leaky-ReLU networks to achieve the approximation with width $w_{\min}$. The analysis combines topological embedding results (Whitney-type) with constructive neural ODE approximations, and it delineates how the output dimension $d_y$ influences the required width, with a detailed treatment of the regimes $d_x+1\le d_y\le 2d_x$ and $d_y>2d_x$. The work also discusses implications for ReLU networks and monotone activations, offers a topological perspective on minimal widths, and highlights open questions about the precise value of $\Delta(d_x,d_y)$ in general.

Abstract

The study of universal approximation properties (UAP) for neural networks (NN) has a long history. When the network width is unlimited, a single hidden layer is sufficient for UAP. In contrast, when the depth is unlimited, the width for UAP must be at least the critical width $w^*_{\min}=\max(d_x,d_y)$, where $d_x$ and $d_y$ are the dimensions of the input and output, respectively. Recently, \cite{cai2022achieve} showed that a leaky-ReLU NN with this critical width can achieve UAP for $L^p$ functions on a compact domain ${K}$, \emph{i.e.,} the UAP for $L^p({K},\mathbb{R}^{d_y})$. This paper examines a uniform UAP for the function class $C({K},\mathbb{R}^{d_y})$ and gives the exact minimum width of the leaky-ReLU NN as $w_{\min}=\max(d_x,d_y)+\Delta(d_x, d_y)$, where $\Delta(d_x, d_y)$ is the number of additional dimensions needed to approximate continuous functions with diffeomorphisms via embedding. To obtain this result, we propose a novel lift-flow-discretization approach, which shows that the uniform UAP has a deep connection with topological theory.
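
The width formula in the abstract can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the paper: the function names (`critical_width`, `min_width_uniform`, `narrow_network`) are hypothetical, the leaky-ReLU slope of 0.1 is an arbitrary choice, and $\Delta(d_x,d_y)$ is passed in by the caller because its general value is not stated in this summary. The forward pass simply shows what "width" means here: every hidden layer has the same number of neurons.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def critical_width(d_x: int, d_y: int) -> int:
    """Critical width w*_min = max(d_x, d_y) from the abstract."""
    return max(d_x, d_y)


def min_width_uniform(d_x: int, d_y: int, delta: int) -> int:
    """w_min = max(d_x, d_y) + Delta(d_x, d_y).

    `delta` is supplied by the caller (an assumption of this sketch):
    the summary does not give a closed form for Delta.
    """
    if delta < 0:
        raise ValueError("delta must be non-negative")
    return critical_width(d_x, d_y) + delta


def leaky_relu(v: Sequence[float], slope: float = 0.1) -> Vector:
    """Apply leaky-ReLU elementwise; the slope value is illustrative."""
    return [t if t >= 0 else slope * t for t in v]


def affine(W: Matrix, b: Sequence[float], v: Sequence[float]) -> Vector:
    """Compute W v + b with plain Python lists."""
    return [sum(w * t for w, t in zip(row, v)) + b_i for row, b_i in zip(W, b)]


def narrow_network(
    x: Sequence[float],
    hidden: Sequence[Tuple[Matrix, Vector]],
    W_out: Matrix,
    b_out: Vector,
    slope: float = 0.1,
) -> Vector:
    """Forward pass: leaky-ReLU hidden layers, then a linear output layer."""
    h = list(x)
    for W, b in hidden:
        h = leaky_relu(affine(W, b, h), slope)
    return affine(W_out, b_out, h)


# Illustrative usage: d_x = d_y = 2 with a placeholder delta = 1 gives width 3.
width = min_width_uniform(2, 2, delta=1)
hidden_layers = [([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], [0.0, 0.0, 0.0])]
y = narrow_network(
    [1.0, -2.0], hidden_layers, [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [0.0, 0.0]
)
```

The value `delta=1` in the usage lines is a placeholder for demonstration, not a claim about $\Delta(2,2)$.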

Paper Structure (25 sections, 10 theorems, 21 equations, 3 figures, 2 tables)

This paper contains 25 sections, 10 theorems, 21 equations, 3 figures, and 2 tables.

Introduction
Contributions
Related work
Organization
Main results
Main theorem
Proof ideas
Lift-flow-discretization approach
Theory of the lift-flow-discretization approach
Proof of Lemma \ref{th:w_upper_bound}
Effect of the output dimension
The particular dimensions $d_x + 1 \le d_y \le 2d_x$
The case of $d_y > 2d_x$
Discussion
Notations
...and 10 more sections

Key Result

Theorem 2.2

Let $\mathcal{K} \subset \mathbb{R}^{d_x}$ be a compact set; then, for the continuous function class $C(\mathcal{K},\mathbb{R}^{d_y})$, the minimum width $w_{\min}$ of leaky-ReLU neural networks having $C$-UAP is $w_{\min}=\max(d_x,d_y)+\Delta(d_x, d_y)$, where $\Delta(d_x,d_y)$ is the auxiliary for

Figures (3)

Figure 1: Continuous function, neural network, and diffeomorphism. (a) An example function from $\mathcal{K} \subset \mathbb{R}^{d_x}$ to $\mathbb{R}^{d_y}$. (b) Feedforward neural networks with depth $L$ and width $N$. (c) Approximate $f^*$ using two linear transformations, $\alpha$ and $\beta$, and a diffeomorphism $\Phi$. (d) An intuitive construction of $\Phi$, where $\alpha$ and $\beta$ are a lift and a projection, respectively, i.e., $\alpha(x_1,x_2)=(x_1,x_2,0), \beta(z_1,z_2,z_3)=(z_1,z_2)$.
Figure 2: Sketch of the lift-flow-discretization approach. The target map ${\Phi}(x)$ is approximated by a flow map $\tilde{\phi}^\tau(x)$ of an ODE, which is further approximated by a flow map $\phi^\tau(x)$ of a neural ODE (\ref{eq:tanh_ODE}).
Figure 3: Example of $d_x=1$. Approximate the '4'-shape curve (a) in $\mathbb{R}^2$ by lifting it to the three-dimensional curve (b) in $\mathbb{R}^3$.
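
The lift, flow, and projection pattern described in the captions of Figures 1 and 2 can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the paper's construction: `alpha` and `beta` follow the lift and projection given in Figure 1(d), while the vector field `rotation_field`, the explicit Euler scheme, and the step count are hypothetical choices made here. The Euler steps stand in for the discretization stage; the paper realizes that stage with leaky-ReLU layers. The example shows why lifting helps: the non-injective map $(x_1,x_2)\mapsto(0,x_2)$ is obtained, up to discretization error, from an invertible rotation in $\mathbb{R}^3$.

```python
import math
from typing import Callable, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def alpha(x: Vec2) -> Vec3:
    """Lift from Figure 1(d): (x1, x2) -> (x1, x2, 0)."""
    return (x[0], x[1], 0.0)


def beta(z: Vec3) -> Vec2:
    """Projection from Figure 1(d): (z1, z2, z3) -> (z1, z2)."""
    return (z[0], z[1])


def rotation_field(z: Vec3) -> Vec3:
    """Hypothetical vector field: rotation in the (z1, z3)-plane."""
    return (-z[2], 0.0, z[0])


def euler_flow(
    z: Vec3, field: Callable[[Vec3], Vec3], tau: float, steps: int
) -> Vec3:
    """Approximate the time-tau flow map of dz/dt = field(z) by explicit Euler."""
    h = tau / steps
    for _ in range(steps):
        v = field(z)
        z = (z[0] + h * v[0], z[1] + h * v[1], z[2] + h * v[2])
    return z


def approximate_map(x: Vec2, tau: float = math.pi / 2, steps: int = 2000) -> Vec2:
    """Compose projection, discretized flow, and lift: beta(Phi(alpha(x)))."""
    return beta(euler_flow(alpha(x), rotation_field, tau, steps))


# With tau = pi/2 the rotation moves z1 into z3, so the first output
# coordinate is close to 0 and the second coordinate is left unchanged.
result = approximate_map((1.0, 2.0))
```

Increasing `steps` shrinks the Euler error, which mirrors the role of depth when the flow is discretized by a narrow network.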

Theorems & Definitions (19)

Definition 2.1
Theorem 2.2
Lemma 2.3
Lemma 2.4
Lemma 3.1
Lemma 3.2
Corollary 3.3
Lemma 4.1
Lemma 4.2
Corollary 5.1
...and 9 more
