On the growth of the parameters of approximating ReLU neural networks
Erion Morina, Martin Holler
TL;DR
The paper investigates how large the parameters of neural networks must become as the networks approximate smooth functions, rather than how the approximation error scales with the architecture. It proves that the parameters of deep ReLU networks achieving near-optimal approximation rates grow at most polynomially, not exponentially, and it provides explicit constructions and bounds; a contrasting negative result shows that shallow networks can incur exponential parameter growth for certain activations. The work situates its results relative to the literature, noting that ReQU-based networks can yield uniformly bounded parameters, while the growth rate obtained for ReLU networks is favorable in high dimensions. The findings have implications for error analysis and training stability, and they underscore the practical advantage of deeper architectures in controlling parameter magnitudes during approximation.
Abstract
This work focuses on the analysis of fully connected feedforward ReLU neural networks as they approximate a given smooth function. In contrast to conventionally studied universal approximation properties under growing architectures, e.g., in terms of the width or depth of the networks, we are concerned with the asymptotic growth of the parameters of the approximating networks. Such results are of interest, e.g., for error analysis or for consistency results for neural network training. The main result of our work is that, for a ReLU architecture with state-of-the-art approximation error, the realizing parameters grow at most polynomially. The obtained rate with respect to a normalized network size is compared to existing results and is shown to be superior in most cases, in particular for high-dimensional input.
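To make the distinction between approximation error and parameter size concrete, the following minimal sketch uses the well-known sawtooth construction for approximating x^2 on [0, 1] with a deep ReLU network. It is an illustration only and is not the construction from this paper; the function names (`hat`, `square_approx`, `sup_error`) and the grid size are assumptions made for the example. In this particular case every weight and bias has magnitude at most 4 regardless of depth, while the error shrinks with each added layer.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x):
    # Hat function on [0, 1] built from three ReLU units.
    # All coefficients (2, -4, 2) and biases (0, -0.5, -1) are bounded by 4.
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)

def square_approx(x, m):
    # Depth-m approximation of x**2 on [0, 1]:
    #   f_m(x) = x - sum_{s=1}^{m} g_s(x) / 4**s,
    # where g_s is the s-fold composition of the hat function.
    out = np.array(x, dtype=float)
    g = np.array(x, dtype=float)
    for s in range(1, m + 1):
        g = hat(g)
        out = out - g / 4.0 ** s
    return out

def sup_error(m, n_grid=4097):
    # Maximum error on a dyadic grid (illustrative choice of grid size).
    xs = np.linspace(0.0, 1.0, n_grid)
    return float(np.max(np.abs(square_approx(xs, m) - xs ** 2)))

if __name__ == "__main__":
    for m in range(1, 7):
        print(m, sup_error(m))
```

Here the error decreases by a factor of 4 with each additional layer while the largest coefficient stays fixed. The setting studied in the paper, general smooth functions in several variables, is more demanding, which is why parameter growth becomes the quantity of interest there.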
