Parameter Symmetry Potentially Unifies Deep Learning Theory
Liu Ziyin, Yizhou Xu, Tomaso Poggio, Isaac Chuang
TL;DR
This position paper argues that a wide range of deep learning phenomena, spanning learning dynamics, model complexity, and neural representations, can be unified under parameter symmetry and its breaking and restoration. It introduces the idea of symmetry-to-symmetry dynamics, in which training proceeds via transitions between symmetry groups, and relates these transitions to effective model capacity and feature learning. The authors advance three core hypotheses: learning dynamics track changes in symmetry, effective capacity is adaptively controlled by symmetry boundaries, and representation learning is driven by layerwise symmetry, including neural collapse and universal representations. They also discuss mechanisms that drive symmetry restoration or breaking (regularization, noise, data augmentation) and outline practical ways to engineer symmetries to shape hierarchical learning. The work suggests that symmetry may be a fundamental principle from which broad AI phenomena can be derived, offering a principled design lens for future models and a path toward a universal theory of deep learning.
Abstract
The dynamics of learning in modern large AI systems is hierarchical, often characterized by abrupt, qualitative shifts akin to phase transitions observed in physical systems. While these phenomena hold promise for uncovering the mechanisms behind neural networks and language models, existing theories remain fragmented, addressing specific cases. In this position paper, we advocate for the crucial role of the research direction of parameter symmetries in unifying these fragmented theories. This position is founded on a centralizing hypothesis for this direction: parameter symmetry breaking and restoration are the unifying mechanisms underlying the hierarchical learning behavior of AI models. We synthesize prior observations and theories to argue that this direction of research could lead to a unified understanding of three distinct hierarchies in neural networks: learning dynamics, model complexity, and representation formation. By connecting these hierarchies, our position paper elevates symmetry -- a cornerstone of theoretical physics -- to become a potential fundamental principle in modern AI.
