Spectral complexity of deep neural networks
Simmaco Di Lillo, Domenico Marinucci, Michele Salvi, Stefano Vigogna
TL;DR
This work introduces a spectral framework for understanding depth in neural networks. It studies the angular power spectrum of infinite-width networks, viewed as isotropic random fields on the sphere. It defines a depth-dependent spectral law $X_L$, a random variable with distribution $P(X_L=\ell)=D_{\ell;\kappa_L}$ determined by the angular power spectrum of the depth-$L$ limiting field, and classifies architectures into low-disorder, sparse, or high-disorder regimes according to the value of $\kappa'(1)$. The authors prove regime-dependent asymptotics for the spectral moments and for the limiting behavior of the random fields and their derivatives. They show that ReLU networks exhibit a sparse, low-frequency structure with high Sobolev energy, while tanh-type networks become increasingly oscillatory with depth. They also introduce the spectral effective support and the spectral effective dimension as practical complexity measures. Numerical experiments based on Monte Carlo simulations and HEALPix corroborate the theory, revealing sharp differences across activation functions and depths. The results offer a principled, depth-aware notion of complexity and point to future directions: the geometry of random fields, regimes where width and depth are both finite, and extensions to convolutional architectures.
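To make the spectral law concrete, the following is a minimal numerical sketch, not the authors' implementation. It rests on several assumptions that the summary does not state: the input sphere is $S^2$, so that the kernel expands in Legendre polynomials as $\kappa_L(u)=\sum_\ell D_\ell P_\ell(u)$ with $P_\ell(1)=1$; $\kappa_L$ is the $L$-fold composition of a one-layer kernel $\kappa$ normalized so that $\kappa(1)=1$; the ReLU kernel is the standard order-1 arc-cosine kernel; and the three regimes correspond to $\kappa'(1)<1$, $\kappa'(1)=1$, and $\kappa'(1)>1$. The function names `relu_kappa`, `spectral_law`, and `classify_regime` are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval


def relu_kappa(u):
    """Normalized ReLU (order-1 arc-cosine) kernel, scaled so kappa(1) = 1."""
    u = np.clip(u, -1.0, 1.0)
    return (np.sqrt(1.0 - u**2) + u * (np.pi - np.arccos(u))) / np.pi


def spectral_law(kappa, depth, ell_max=64, n_quad=400):
    """Approximate D_ell = (2*ell + 1)/2 * int_{-1}^{1} kappa_L(u) P_ell(u) du.

    Assumes the sphere S^2 (Legendre expansion) and uses Gauss-Legendre
    quadrature; the result is truncated at ell_max.
    """
    nodes, weights = leggauss(n_quad)
    values = nodes
    for _ in range(depth):  # kappa_L: kappa composed with itself `depth` times
        values = kappa(values)
    D = np.empty(ell_max + 1)
    for ell in range(ell_max + 1):
        one_hot = np.zeros(ell + 1)
        one_hot[ell] = 1.0
        P_ell = legval(nodes, one_hot)  # Legendre polynomial P_ell at the nodes
        D[ell] = 0.5 * (2 * ell + 1) * np.sum(weights * values * P_ell)
    return D


def classify_regime(kappa_prime_at_one, tol=1e-9):
    """Map kappa'(1) to a regime name (thresholds are an assumption here)."""
    if kappa_prime_at_one < 1.0 - tol:
        return "low-disorder"
    if kappa_prime_at_one > 1.0 + tol:
        return "high-disorder"
    return "sparse"


D1 = spectral_law(relu_kappa, depth=1)
D5 = spectral_law(relu_kappa, depth=5)
print("depth 1:", D1[:4].round(4), "mass up to ell_max:", D1.sum().round(4))
print("depth 5:", D5[:4].round(4), "mass up to ell_max:", D5.sum().round(4))
# For this ReLU kernel, kappa'(u) = (pi - arccos(u)) / pi, hence kappa'(1) = 1.
print("ReLU regime:", classify_regime(1.0))
```

Under these assumptions, the depth-1 ReLU coefficients can be checked by hand: $D_0 = 3/8$ and $D_1 = 1/2$. Because this kernel satisfies $\kappa(u)\ge u$ on $[-1,1]$, the mass at $\ell=0$ grows with depth, which is consistent with the low-frequency concentration described above for ReLU.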
Abstract
It is well known that randomly initialized, push-forward, fully-connected neural networks converge weakly to isotropic Gaussian processes in the limit where the widths of all layers go to infinity. In this paper, we propose to use the angular power spectrum of the limiting field to characterize the complexity of the network architecture. In particular, we define sequences of random variables associated with the angular power spectrum, and provide a full characterization of the network complexity in terms of the asymptotic distribution of these sequences as the depth diverges. On this basis, we classify neural networks as low-disorder, sparse, or high-disorder; we show how this classification highlights a number of distinct features of standard activation functions, in particular the sparsity properties of ReLU networks. Our theoretical results are also validated by numerical simulations.
