Theory-to-Practice Gap for Neural Networks and Neural Operators
Philipp Grohs, Samuel Lanthaler, Margaret Trautner
TL;DR
This work establishes a unified theory-to-practice gap framework for learning with ReLU neural networks and neural-operator architectures. In finite dimensions, it sharpens upper bounds on the best possible sampling rate $\beta_\ast$ in general $L^p$-norms, showing $\beta_\ast \le \frac{1}{p} + \frac{1}{d}\cdot\frac{\alpha}{\alpha+\lfloor \bm{\ell}^*/2\rfloor}$. Since the second term never exceeds $\frac{1}{d}$, this reveals a dimension-dependent bottleneck that persists even for large parametric rates. Extending to infinite dimensions, the paper proves that for Deep Operator Networks and integral-kernel neural operators the optimal convergence rate in the Bochner $L^p$-norm is bounded by $\beta_\ast \le \frac{1}{p}$, while uniform convergence over infinite-dimensional input sets admits no algebraic rate ($\beta_\ast = 0$). The results apply to Fourier Neural Operators and related kernel-based operators, providing a cohesive $L^p$-theory that bridges finite- and infinite-dimensional learning and clarifies fundamental limits of these data-driven methods. They quantify how the limited information carried by finitely many samples constrains the learning of mappings and operators in high dimensions.
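To make the finite-dimensional bound quoted above concrete, the following minimal sketch evaluates its right-hand side exactly. It is illustrative only: the function name `sampling_rate_upper_bound` is hypothetical, and the summary does not define $\alpha$ or $\bm{\ell}^*$, so the code simply assumes $\alpha > 0$ and treats $\bm{\ell}^*$ as a finite nonnegative integer parameter `ell_star`.

```python
from fractions import Fraction


def sampling_rate_upper_bound(p, d, alpha, ell_star):
    """Evaluate 1/p + (1/d) * alpha / (alpha + floor(ell_star / 2)) exactly.

    Assumptions (not stated in the summary): p is a finite exponent >= 1,
    d is a positive integer dimension, alpha > 0, and ell_star is a finite
    nonnegative integer.
    """
    if p < 1 or d < 1 or alpha <= 0 or ell_star < 0:
        raise ValueError("expected p >= 1, d >= 1, alpha > 0, ell_star >= 0")
    alpha = Fraction(alpha)
    half = int(ell_star) // 2  # floor(ell_star / 2)
    return 1 / Fraction(p) + Fraction(1, int(d)) * alpha / (alpha + half)


if __name__ == "__main__":
    # Hypothetical parameter choices, used only to show how the bound behaves.
    for alpha in (1, 10, 1000):
        bound = sampling_rate_upper_bound(p=2, d=10, alpha=alpha, ell_star=4)
        print(alpha, bound, float(bound))
```

For the hypothetical choice $p = 2$, $d = 10$, $\bm{\ell}^* = 4$ and $\alpha = 1$, the bound evaluates to $\frac{1}{2} + \frac{1}{10}\cdot\frac{1}{3} = \frac{8}{15}$. Increasing $\alpha$ moves the value toward $\frac{1}{p} + \frac{1}{d} = \frac{3}{5}$ without reaching it whenever $\lfloor \bm{\ell}^*/2 \rfloor \ge 1$.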
Abstract
This work studies the sampling complexity of learning with ReLU neural networks and neural operators. For mappings belonging to relevant approximation spaces, we derive upper bounds on the best-possible convergence rate of any learning algorithm, with respect to the number of samples. In the finite-dimensional case, these bounds imply a gap between the parametric and sampling complexities of learning, known as the \emph{theory-to-practice gap}. In this work, a unified treatment of the theory-to-practice gap is achieved in a general $L^p$-setting, while at the same time improving available bounds in the literature. Furthermore, based on these results, the theory-to-practice gap is extended to the infinite-dimensional setting of operator learning. Our results apply to Deep Operator Networks and integral kernel-based neural operators, including the Fourier neural operator. We show that the best-possible convergence rate in a Bochner $L^p$-norm is bounded by Monte Carlo rates of order $1/p$.
