Large-width functional asymptotics for deep Gaussian neural networks
Daniele Bracale, Stefano Favaro, Sandra Fortini, Stefano Peluchetti
TL;DR
This work develops a function-space framework for studying infinitely wide deep neural networks with Gaussian weights and biases. By treating networks as stochastic processes on $\mathbb{R}^I$ and employing Lévy's theorem, the Daniell–Kolmogorov extension theorem, and Kolmogorov–Chentsov arguments, the authors show that the networks converge to continuous Gaussian processes in the large-width limit, with sample paths that are locally $\gamma$-Hölder for any $0<\gamma<1$ when the activation is Lipschitz. They derive explicit recursion formulas for the limiting covariance $\Sigma(l)$, establishing that fixed-unit limits are Gaussian with $\Sigma(1)_{ij}=\sigma_b^2+\sigma_\omega^2\langle x^{(i)},x^{(j)}\rangle$ and $\Sigma(l)_{ij}=\sigma_b^2+\sigma_\omega^2\int\phi(u)\phi(v)\, q^{(l-1)}(du,dv)$, where $q^{(l-1)}=N_k(0,\Sigma(l-1))$. The vector of all units converges to a product of Gaussians across units, enabling a rigorous weak convergence result in a stronger function-space metric. Overall, the paper strengthens the theoretical connection between infinitely wide deep networks and Gaussian processes and lays groundwork for GP-based analysis of broader neural-network architectures.
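The covariance recursion can be illustrated numerically. The following is a minimal sketch, not the authors' code: it evaluates $\Sigma(1)$ exactly and approximates the integral in $\Sigma(l)$ by Monte Carlo sampling from $N_k(0,\Sigma(l-1))$. The function name `limiting_covariance`, the choice of $\tanh$ as the Lipschitz activation, the parameter values, and the sample size are all assumptions made for illustration.

```python
import numpy as np


def limiting_covariance(X, depth, sigma_b=1.0, sigma_w=1.0,
                        phi=np.tanh, n_samples=200_000, seed=0):
    """Approximate Sigma(depth) for k inputs stacked as rows of X (shape k x I).

    Sigma(1) is computed exactly; deeper layers use a Monte Carlo estimate of
    the integral of phi(u) * phi(v) against N_k(0, Sigma(l-1)).
    All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    k = X.shape[0]

    # Sigma(1)_ij = sigma_b^2 + sigma_w^2 * <x_i, x_j>
    Sigma = sigma_b**2 + sigma_w**2 * (X @ X.T)

    for _ in range(2, depth + 1):
        # Draw samples (u_1, ..., u_k) from q^(l-1) = N_k(0, Sigma(l-1)).
        Z = rng.multivariate_normal(np.zeros(k), Sigma, size=n_samples)
        A = phi(Z)
        # Monte Carlo estimate of E[phi(u_i) * phi(u_j)] for all pairs (i, j).
        Sigma = sigma_b**2 + sigma_w**2 * (A.T @ A) / n_samples

    return Sigma


# Usage: two inputs in R^2, three layers.
X_demo = np.array([[1.0, 0.0], [0.0, 1.0]])
print(limiting_covariance(X_demo, depth=3, n_samples=50_000))
```

Monte Carlo is used here only because it works for any activation; for specific activations the integral may be available in closed form or by quadrature.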
Abstract
In this paper, we consider fully connected feed-forward deep neural networks where weights and biases are independent and identically distributed according to Gaussian distributions. Extending previous results (Matthews et al., 2018a,b; Yang, 2019), we adopt a function-space perspective, i.e., we look at neural networks as infinite-dimensional random elements on the input space $\mathbb{R}^I$. Under suitable assumptions on the activation function we show that: i) a network defines a continuous Gaussian process on the input space $\mathbb{R}^I$; ii) a network with re-scaled weights converges weakly to a continuous Gaussian process in the large-width limit; iii) the limiting Gaussian process has almost surely locally $\gamma$-Hölder continuous paths, for $0 < \gamma < 1$. Our results contribute to recent theoretical studies on the interplay between infinitely wide deep neural networks and Gaussian processes by establishing weak convergence in function-space with respect to a stronger metric.
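To make the finite-width object in point ii) concrete, here is a minimal simulation sketch. It assumes the common parametrization in which the first layer is unscaled and every later layer divides the weighted sum by $\sqrt{n}$, where $n$ is the width; the abstract does not spell out the re-scaling, so this choice, the function name `sample_first_unit`, and the $\tanh$ activation are assumptions for illustration only.

```python
import numpy as np


def sample_first_unit(X, width, depth, n_draws, sigma_b=1.0, sigma_w=1.0,
                      phi=np.tanh, seed=0):
    """Sample the first unit of the last layer at k inputs (rows of X).

    Returns an array of shape (n_draws, k), one independent network per row.
    Assumed parametrization (hypothetical, for illustration):
      layer 1:   f = W x + b
      layer l>1: f = W phi(f_prev) / sqrt(width) + b
    with W entries ~ N(0, sigma_w^2) and b entries ~ N(0, sigma_b^2).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    k, input_dim = X.shape
    out = np.empty((n_draws, k))

    for d in range(n_draws):
        # First layer: unscaled Gaussian weights acting on the inputs.
        W = rng.normal(0.0, sigma_w, size=(width, input_dim))
        b = rng.normal(0.0, sigma_b, size=(width, 1))
        f = W @ X.T + b  # shape (width, k)

        for _ in range(2, depth + 1):
            # Deeper layers: sum over `width` units, re-scaled by 1/sqrt(width).
            W = rng.normal(0.0, sigma_w, size=(width, width))
            b = rng.normal(0.0, sigma_b, size=(width, 1))
            f = (W @ phi(f)) / np.sqrt(width) + b

        out[d] = f[0]  # keep one fixed unit of the last layer

    return out


# Usage: empirical second-moment matrix of one unit across many random networks.
X_demo = np.array([[1.0, 0.0], [0.0, 1.0]])
draws = sample_first_unit(X_demo, width=64, depth=3, n_draws=500)
print(draws.T @ draws / draws.shape[0])
```

Under this assumed scaling, the empirical second-moment matrix of a fixed unit should approach the limiting covariance as the width and the number of draws grow, which is the finite-dimensional shadow of the function-space convergence the paper proves.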
