Applications of Statistical Field Theory in Deep Learning
Zohar Ringel, Noa Rubin, Edo Mor, Moritz Helias, Inbar Seroussi
TL;DR
This work surveys the application of statistical field theory to deep learning, arguing that a physics-inspired framework built on path integrals, replicas, and large-width limits can illuminate generalization, implicit bias, and feature learning. It develops three analytic strands: (i) infinite-width mappings, via the Gaussian-process and neural tangent kernel (NTK) limits, that connect deep neural networks (DNNs) to Gaussian process regression (GPR) and kernel methods, (ii) field-theoretic treatments of data-averaged GPR, using replicas and the renormalization group (RG), to capture dataset effects and scaling laws, and (iii) dynamical field theories in the Martin-Siggia-Rose-de Dominicis-Janssen (MSRDJ) formalism for non-linear, finite-width networks, which bridge equilibrium GP/NTK descriptions with time-dependent learning. These approaches yield concrete insights such as spectral bias, effective ridge renormalization, and kernel adaptation mechanisms that can explain when and how deep networks outperform their lazy or linear counterparts. The synthesis points to practical implications such as hyperparameter transfer, scaling laws, and principled regularization via kernel dynamics, while outlining avenues for extending field-theoretic analyses to richer architectures and dynamics. Overall, the paper sketches a path toward a unifying theory of deep learning grounded in statistical physics, with tangible predictions for generalization and learning dynamics.
Abstract
Deep learning algorithms have made incredible strides in the past decade, yet due to their complexity, the science of deep learning remains in its early stages. Because deep learning is an experimentally driven field, it is natural to seek a theory of it within the physics paradigm. As deep learning is largely about learning functions and distributions over functions, statistical field theory, a rich and versatile toolbox for tackling complex distributions over functions (fields), is an obvious choice of formalism. Research efforts carried out in the past few years have demonstrated the ability of field theory to provide useful insights into generalization, implicit bias, and feature learning effects. Here we provide a pedagogical review of this emerging line of research.
