Neural Hilbert Ladders: Multi-Layer Neural Networks in Function Space
Zhengdao Chen
TL;DR
This work introduces the Neural Hilbert Ladder (NHL), a hierarchical framework with no bound on width that views a multi-layer neural network as generating a hierarchy of RKHSs; the associated function space $\mathcal{F}^{(L)}$ is defined as an infinite union of RKHSs. It establishes a static correspondence between L-layer NNs and L-level NHLs, derives generalization guarantees via Rademacher complexity, and demonstrates depth-dependent capacity through depth separation under the ReLU activation. In the infinite-width mean-field limit, training induces a non-Markovian functional gradient flow that evolves the NHL kernels, capturing feature learning beyond lazy training. The paper also reports two numerical experiments illustrating feature learning and kernel alignment, situates NHL among existing kernel-based and mean-field analyses, and outlines directions for future work, including relaxing its assumptions. Overall, NHL offers a rigorous, depth-aware, function-space view of deep networks with quantitative bounds on approximation and generalization, highlighting the role of depth in shaping representational capacity.
Abstract
Characterizing the function space explored by neural networks (NNs) is an important aspect of learning theory. In this work, noticing that a multi-layer NN implicitly generates a hierarchy of reproducing kernel Hilbert spaces (RKHSs), which we call a neural Hilbert ladder (NHL), we define the function space as an infinite union of RKHSs, generalizing the existing Barron space theory of two-layer NNs. We then establish several theoretical properties of the new space. First, we prove a correspondence between functions expressed by L-layer NNs and those belonging to L-level NHLs. Second, we prove generalization guarantees for learning an NHL with a controlled complexity measure. Third, we derive non-Markovian dynamics of random fields that govern the evolution of the NHL induced by training multi-layer NNs in an infinite-width mean-field limit. Fourth, we show examples of depth separation in NHLs under the ReLU activation function. Finally, we perform numerical experiments to illustrate the feature learning aspect of NN training through the lens of NHLs.
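As a rough illustration of the hierarchy mentioned in the abstract, the sketch below computes one empirical kernel matrix per layer of a randomly initialized, finite-width ReLU network; each such matrix determines an RKHS restricted to the sample points. The function name `ladder_kernels`, the Gaussian initialization, the 1/sqrt(m) scaling of pre-activations, and the use of a linear kernel at the first level are assumptions made for this example only. The paper defines the ladder at the level of function spaces, without a width bound, so this is a finite-width caricature rather than the paper's construction.

```python
import numpy as np


def relu(z):
    """ReLU activation applied elementwise."""
    return np.maximum(z, 0.0)


def ladder_kernels(X, widths, seed=0):
    """Return a list of empirical kernel matrices, one per level.

    X      : array of shape (n, d), the input points.
    widths : hidden-layer widths [m_1, ..., m_k] (hypothetical parameter).

    Level 1 uses the linear kernel X X^T (an assumption of this sketch).
    Each later level averages products of activations over the neurons
    of the corresponding hidden layer.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    kernels = [X @ X.T]

    # Pre-activations of the first hidden layer, shape (n, m_1).
    H = X @ rng.standard_normal((d, widths[0]))
    for level, m in enumerate(widths):
        A = relu(H)                      # activations, shape (n, m)
        kernels.append(A @ A.T / m)      # empirical kernel: average over neurons
        if level + 1 < len(widths):
            W = rng.standard_normal((m, widths[level + 1]))
            # 1/sqrt(m) keeps pre-activations O(1) at random initialization;
            # this scaling is chosen for illustration, not taken from the paper.
            H = A @ W / np.sqrt(m)
    return kernels


if __name__ == "__main__":
    X = np.random.default_rng(1).standard_normal((5, 3))
    for level, K in enumerate(ladder_kernels(X, widths=[64, 32]), start=1):
        print(f"level {level}: kernel matrix of shape {K.shape}")
```

Because each matrix has the form of a Gram matrix, it is symmetric and positive semidefinite up to floating-point error, which is what allows it to serve as a kernel on the sample. Training the weights would change the activations and hence the kernels at every level above the first, which is the finite-width counterpart of the kernel evolution the abstract describes.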
