Gaussian Universality in Neural Network Dynamics with Generalized Structured Input Distributions
Jaeyong Bae, Hawoong Jeong
TL;DR
This work broadens the hidden manifold framework by modeling neural network inputs as general Gaussian mixtures and shows that, after standardization and under weak correlations, SGD dynamics converge to the Gaussian benchmark predicted by the Gaussian Equivalence Property. The authors establish rigorous results on asymptotic Gaussianity and covariance consistency for block-dependent mixtures and support them with numerical evidence, including a Berry–Esseen-based bound and data collapse across dimensions. They further explore the limits of universality under increased input correlation and on real-world data, identifying key drivers of deviations in the dynamics and offering practical guidance on when Gaussian-based analyses remain informative. Overall, the study strengthens the theoretical foundation of deep learning dynamics by revealing a robust form of universality driven by low-order moments rather than the exact distributional form.
Abstract
Analyzing the dynamics of neural networks trained with stochastic gradient descent (SGD) is crucial to building theoretical foundations for deep learning. Previous work has analyzed structured inputs within the hidden manifold model, often under the simplifying assumption of a Gaussian distribution. We extend this framework by modeling inputs as Gaussian mixtures to better represent complex, real-world data. Through empirical and theoretical investigation, we demonstrate that, with proper standardization, the learning dynamics converge to the behavior seen in the simple Gaussian case. This finding reveals a form of universality in which diverse structured distributions yield results consistent with Gaussian assumptions, thereby strengthening the theoretical understanding of deep learning models.
