Minimum Width of Leaky-ReLU Neural Networks for Uniform Universal Approximation
Li'ang Li, Yifei Duan, Guanghua Ji, Yongqiang Cai
TL;DR
This work establishes the exact minimum width at which leaky-ReLU networks achieve uniform universal approximation of continuous vector-valued functions on compact domains: $w_{\min}=\max(d_x,d_y)+\Delta(d_x,d_y)$, where $\Delta(d_x,d_y)$ counts the additional dimensions needed to approximate continuous functions by diffeomorphisms via embedding. The authors introduce a lift-flow-discretization framework that links the uniform UAP to topology: lift the target function to a diffeomorphism in a higher-dimensional space, realize that diffeomorphism as the flow of a neural ODE, and discretize the flow with a leaky-ReLU network of width $w_{\min}$. The analysis combines topological embedding results (Whitney-type) with constructive neural ODE approximations, and it describes how the output dimension $d_y$ affects the required width, with a detailed treatment of the regimes $d_x+1\le d_y\le 2d_x$ and $d_y>2d_x$. The work also discusses implications for ReLU networks and monotone activations, and it notes that the precise value of $\Delta(d_x,d_y)$ in general remains an open question.
Abstract
The study of the universal approximation property (UAP) of neural networks (NNs) has a long history. When the network width is unlimited, a single hidden layer suffices for the UAP. In contrast, when the depth is unlimited, the width required for the UAP must be at least the critical width $w^*_{\min}=\max(d_x,d_y)$, where $d_x$ and $d_y$ are the dimensions of the input and output, respectively. Recently, \cite{cai2022achieve} showed that a leaky-ReLU NN with this critical width can achieve the UAP for $L^p$ functions on a compact domain ${K}$, \emph{i.e.,} the UAP for $L^p({K},\mathbb{R}^{d_y})$. This paper examines the uniform UAP for the function class $C({K},\mathbb{R}^{d_y})$ and gives the exact minimum width of the leaky-ReLU NN as $w_{\min}=\max(d_x,d_y)+\Delta(d_x, d_y)$, where $\Delta(d_x, d_y)$ is the number of additional dimensions needed to approximate continuous functions by diffeomorphisms via embedding. To obtain this result, we propose a novel lift-flow-discretization approach, which shows that the uniform UAP is deeply connected to topological theory.
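The role of diffeomorphisms in the width formula can be illustrated with a small sketch. The code below is not taken from the paper: the function names, the 2-by-2 example, and the slope `alpha = 0.1` are assumptions made for illustration. It shows that a leaky-ReLU layer with an invertible square weight matrix is a bijection, so a network whose width equals the input dimension is an invertible map; this is one way to see why approximating a general continuous function uniformly may call for additional dimensions. The value of $\Delta(d_x,d_y)$ is not computed here and must be supplied by the caller.

```python
def critical_width(d_x, d_y):
    """Critical width max(d_x, d_y) stated in the abstract."""
    return max(d_x, d_y)


def minimum_width(d_x, d_y, delta):
    """w_min = max(d_x, d_y) + Delta(d_x, d_y).

    `delta` is a caller-supplied placeholder for Delta(d_x, d_y);
    this sketch does not derive it.
    """
    return max(d_x, d_y) + delta


def leaky_relu(v, alpha=0.1):
    """Elementwise leaky-ReLU with negative-side slope alpha > 0."""
    return [x if x >= 0 else alpha * x for x in v]


def leaky_relu_inverse(v, alpha=0.1):
    """Inverse of leaky_relu: a leaky-ReLU with slope 1 / alpha."""
    return [x if x >= 0 else x / alpha for x in v]


def affine(W, b, v):
    """Compute W v + b for a matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def inverse_2x2(W):
    """Explicit inverse of a 2-by-2 matrix; raises if it is singular."""
    (a, b), (c, d) = W
    det = a * d - b * c
    if det == 0:
        raise ValueError("weight matrix must be invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


def layer(W, b, v, alpha=0.1):
    """One width-2 leaky-ReLU layer: leaky_relu(W v + b)."""
    return leaky_relu(affine(W, b, v), alpha)


def layer_inverse(W, b, y, alpha=0.1):
    """Undo `layer`: invert the activation, then the affine map."""
    z = leaky_relu_inverse(y, alpha)
    shifted = [zi - bi for zi, bi in zip(z, b)]
    return affine(inverse_2x2(W), [0.0, 0.0], shifted)


# Hypothetical example weights, chosen only so that W is invertible.
W = [[1.0, 2.0], [0.0, 1.0]]
b = [0.5, -1.0]
x = [0.5, -2.0]

y = layer(W, b, x)
x_recovered = layer_inverse(W, b, y)
```

Here `x_recovered` matches `x` up to floating-point rounding, which confirms that the layer loses no information. Composing such layers keeps the map invertible, so the sketch only covers the case where every weight matrix is square and nonsingular.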
