TASI Lectures on Physics for Machine Learning
Jim Halverson
TL;DR
These notes survey a field-theoretic view of neural networks, organized around expressivity, statistics, and dynamics. They connect classic results, such as the Universal Approximation Theorem and the neural network Gaussian process (NNGP) limit, to more recent work on neural tangent kernels and feature learning, and culminate in a neural network field theory perspective that offers a potential bridge to quantum and statistical field theories. Key contributions include a clean derivation of the NNGP limit, an analysis of non-Gaussian corrections, a principled scaling framework for feature learning via the maximal update parameterization, and a mapping between neural networks and interacting field theories, including $\phi^4$ theory. The practical significance lies in analytic control over learning dynamics, guidance for principled architecture design, and a framework for applying field-theoretic techniques in ML theory and, conversely, neural network methods in field theory.
Abstract
These notes are based on lectures I gave at TASI 2024 on Physics for Machine Learning. The focus is on neural network theory, organized according to network expressivity, statistics, and dynamics. I present classic results such as the universal approximation theorem and the neural network / Gaussian process correspondence, as well as more recent results such as the neural tangent kernel, feature learning with the maximal update parameterization, and Kolmogorov-Arnold networks. The exposition emphasizes a field-theoretic perspective familiar to theoretical physicists. I elaborate on connections between neural network theory and field theory, including a neural network approach to field theory.
