On bounds for norms of reparameterized ReLU artificial neural network parameters: sums of fractional powers of the Lipschitz norm control the network parameter vector
Arnulf Jentzen, Timo Kröger
TL;DR
This work relates the parameter norms of shallow ReLU networks to the Lipschitz norm of their realization functions. It proves that every parameter vector can be reparameterized, without changing its realization, so that the norm of the new parameter vector is bounded, up to a multiplicative constant, by the sum of the Lipschitz norm of the realization raised to the powers $1/2$ and $1$. The authors develop geometric tools (tessellations of convex polytopes and affine-hyperplane analysis) to construct such reparameterizations and prove the main bound, along with a two-sided comparison between the parameter norm and the Lipschitz norm. They also show that both exponents are needed, since the bound fails for the Lipschitz norm alone, and that analogous bounds fail for Hölder and Sobolev-Slobodeckij norms. Together, these results delineate how far parameter-norm bounds for reparameterized networks can go, with potential implications for optimization and parameter identifiability in shallow architectures.
Abstract
It is an elementary fact in the scientific literature that the Lipschitz norm of the realization function of a feedforward fully-connected rectified linear unit (ReLU) artificial neural network (ANN) can, up to a multiplicative constant, be bounded from above by sums of powers of the norm of the ANN parameter vector. Roughly speaking, in this work we reveal in the case of shallow ANNs that the converse inequality is also true. More formally, we prove that the norm of the equivalence class of ANN parameter vectors with the same realization function is, up to a multiplicative constant, bounded from above by the sum of powers of the Lipschitz norm of the ANN realization function (with the exponents $1/2$ and $1$). Moreover, we prove that this upper bound holds only when employing the Lipschitz norm and holds neither for Hölder norms nor for Sobolev-Slobodeckij norms. Furthermore, we prove that this upper bound holds only for sums of powers of the Lipschitz norm with the exponents $1/2$ and $1$ and does not hold for the Lipschitz norm alone.
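To make the idea of parameter vectors "with the same realization function" concrete, the following minimal sketch uses the positive homogeneity of the ReLU: multiplying the inner weight and bias of a hidden neuron by $\lambda > 0$ and dividing its outer weight by $\lambda$ leaves the realization unchanged but changes the parameter norm. This is only the simplest family of reparameterizations and is not the geometric construction of the paper. The one-dimensional input, the function names (`realization`, `param_norm`, `rescale`, `balance`), and the example values are assumptions made for illustration.

```python
import math

def realization(params, x):
    """Shallow ReLU network on R: sum_i v_i * max(w_i * x + b_i, 0) + c."""
    w, b, v, c = params
    return sum(vi * max(wi * x + bi, 0.0) for wi, bi, vi in zip(w, b, v)) + c

def param_norm(params):
    """Euclidean norm of the full parameter vector (w, b, v, c)."""
    w, b, v, c = params
    return math.sqrt(sum(t * t for t in w + b + v) + c * c)

def rescale(params, lam):
    """Scale (w_i, b_i) by lam[i] > 0 and v_i by 1 / lam[i]; the realization is unchanged."""
    w, b, v, c = params
    return (
        [l * wi for l, wi in zip(lam, w)],
        [l * bi for l, bi in zip(lam, b)],
        [vi / l for l, vi in zip(lam, v)],
        c,
    )

def balance(params):
    """Choose lam[i] with lam[i]^2 = |v_i| / |(w_i, b_i)|, which minimizes
    lam^2 * |(w_i, b_i)|^2 + v_i^2 / lam^2 over lam > 0 for each neuron."""
    w, b, v, c = params
    lam = []
    for wi, bi, vi in zip(w, b, v):
        inner = math.hypot(wi, bi)
        # Neurons with zero inner or outer weights are left untouched.
        lam.append(math.sqrt(abs(vi) / inner) if inner > 0 and vi != 0 else 1.0)
    return rescale(params, lam)

# Hypothetical example: large inner weights, small outer weights.
theta = ([100.0, -50.0], [10.0, 5.0], [0.01, 0.02], 0.5)
balanced = balance(theta)

print("norm before:", param_norm(theta))
print("norm after: ", param_norm(balanced))
for x in (-1.0, -0.3, 0.0, 0.7, 1.0):
    print(x, realization(theta, x), realization(balanced, x))
```

After balancing, each hidden neuron contributes $2\,|v_i|\,\|(w_i, b_i)\|$ to the squared parameter norm, a product that depends only on the realized function's slopes and offsets rather than on how the scale is split between layers. This gives some intuition for why a square-root power of a function norm can appear in bounds of the kind stated in the abstract, although the actual proof has to handle far more general reparameterizations.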
