Wilsonian Renormalization of Neural Network Gaussian Processes
Jessica N. Howard, Ro Jefferson, Anindita Maiti, Zohar Ringel
TL;DR
This work introduces a Wilsonian renormalization group (RG) framework for Gaussian Process (GP) regression to study learnable versus unlearnable neural network features. By integrating out high-frequency kernel modes, it derives an RG flow in which the ridge parameter $\sigma^2$ renormalizes and, in non-Gaussian settings, becomes input-dependent, linking scale separation to generalization behavior. The approach yields tractable equations in the Gaussian case and connects to neural scaling laws through the kernel eigenspectrum; empirical tests on MNIST and CIFAR-10 match the observed MSE scaling trends. Extensions to non-Gaussian feature distributions reveal a spatial reweighting of the loss and a functional RG flow for input-dependent regularization, pointing toward a classification of universality classes in deep learning and potential insights into feature learning in large neural networks.
Abstract
Separating relevant and irrelevant information is key to any modeling process or scientific inquiry. Theoretical physics offers a powerful tool for achieving this in the form of the renormalization group (RG). Here we demonstrate a practical approach to performing Wilsonian RG in the context of Gaussian Process (GP) Regression. We systematically integrate out the unlearnable modes of the GP kernel, thereby obtaining an RG flow of the GP in which the data sets the IR scale. In simple cases, this results in a universal flow of the ridge parameter, which becomes input-dependent in the richer scenario in which non-Gaussianities are included. In addition to being analytically tractable, this approach goes beyond structural analogies between RG and neural networks by providing a natural connection between RG flow and learnable vs. unlearnable modes. Studying such flows may improve our understanding of feature learning in deep neural networks, and enable us to identify potential universality classes in these models.
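To make "integrating out kernel modes" concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a toy kernel built from Fourier features on $[0, 2\pi)$ with a power-law eigenspectrum $\lambda_k = k^{-\alpha}$, and it assumes that, in the Gaussian case, the modes above a cutoff act on the training data like extra noise, so that the ridge shifts to $\sigma^2 + \sum_{k > \text{cutoff}} \lambda_k$. The helper names (`fourier_features`, `power_law_spectrum`, `renormalized_ridge`), the target function, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def fourier_features(x, k_max):
    """Unit-variance features sqrt(2)cos(kx), sqrt(2)sin(kx) for k = 1..k_max."""
    phase = np.outer(x, np.arange(1, k_max + 1))
    return np.sqrt(2.0) * np.concatenate([np.cos(phase), np.sin(phase)], axis=1)

def power_law_spectrum(k_max, alpha):
    """Eigenvalues k^(-alpha), duplicated for the cos and sin feature of each k."""
    lam = np.arange(1, k_max + 1, dtype=float) ** (-alpha)
    return np.concatenate([lam, lam])

def kernel(xa, xb, lam, k_max):
    """K(x, x') = sum_k lam_k * phi_k(x) * phi_k(x')."""
    return (fourier_features(xa, k_max) * lam) @ fourier_features(xb, k_max).T

def gp_mean(K_train, K_test_train, y, ridge):
    """Posterior mean of GP regression with ridge (noise variance) `ridge`."""
    return K_test_train @ np.linalg.solve(K_train + ridge * np.eye(len(y)), y)

def renormalized_ridge(ridge, lam, freqs, cutoff):
    """Assumed Gaussian-case rule: add the eigenvalues of the integrated-out modes."""
    return ridge + lam[freqs > cutoff].sum()

rng = np.random.default_rng(0)
k_max, cutoff, alpha, ridge = 200, 20, 2.0, 0.1
x_train = rng.uniform(0.0, 2.0 * np.pi, 100)
x_test = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
y_train = np.sin(x_train) + 0.5 * np.cos(3.0 * x_train)  # toy low-frequency target

lam = power_law_spectrum(k_max, alpha)
freqs = np.tile(np.arange(1, k_max + 1), 2)      # frequency of each feature column
lam_low = np.where(freqs <= cutoff, lam, 0.0)    # keep only modes at or below the cutoff

K_full = kernel(x_train, x_train, lam, k_max)
K_low = kernel(x_train, x_train, lam_low, k_max)
ridge_eff = renormalized_ridge(ridge, lam, freqs, cutoff)

# Full GP with the bare ridge versus truncated GP with the renormalized ridge.
pred_full = gp_mean(K_full, kernel(x_test, x_train, lam, k_max), y_train, ridge)
pred_rg = gp_mean(K_low, kernel(x_test, x_train, lam_low, k_max), y_train, ridge_eff)

print("bare ridge:", ridge, "renormalized ridge:", ridge_eff)
print("max |full - renormalized| prediction gap:", np.abs(pred_full - pred_rg).max())
```

In this toy setting the diagonal of the discarded high-mode kernel equals the ridge shift exactly, while its off-diagonal entries oscillate in sign; treating them as negligible is the approximation the sketch relies on. How small the resulting prediction gap is depends on the cutoff, the spectrum, and the number of training points.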
