Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam, Seok Hyeong Lee, Clementine C J Domine, Yeachan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee
TL;DR
This work argues that solving layerwise linear models under the dynamical feedback principle can illuminate core neural dynamics such as neural collapse, emergence, lazy/rich regimes, and grokking. By focusing on solvable multilayer linear architectures, the authors derive sigmoidal and stage-like learning curves, connect these dynamics to phenomena observed empirically in DNNs, and show how layer imbalance and weight-to-target ratios control the learning regime. The contributions include formalizing the dynamical feedback principle, analyzing toy models, and interpreting phenomena such as neural collapse and grokking within a unified layerwise framework. The approach offers a principled, analytic lens on DNN behavior, with the potential to guide initialization, architecture design, and training strategies toward more interpretable and generalizable models.
Abstract
In physics, complex systems are often simplified into minimal, solvable models that retain only the core principles. In machine learning, layerwise linear models (e.g., linear neural networks) act as simplified representations of neural network dynamics. These models follow the dynamical feedback principle, which describes how layers mutually govern and amplify each other's evolution. This principle extends beyond the simplified models, successfully explaining a wide range of dynamical phenomena in deep neural networks, including neural collapse, emergence, lazy and rich regimes, and grokking. In this position paper, we call for the use of layerwise linear models retaining the core principles of neural dynamical phenomena to accelerate the science of deep learning.
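To make the feedback idea concrete, here is a minimal sketch in Python. It assumes the simplest possible layerwise linear model: a scalar two-layer network f(x) = a*b*x fit to a target slope s. This toy setup, the function names, and the hyperparameters are illustrative assumptions and are not taken from the text above. Under gradient flow on L = 0.5*(s - a*b)^2, each layer's update is scaled by the other layer, and for a balanced initialization (a = b) the product u = a*b obeys du/dt = 2*u*(s - u), whose solution is a sigmoid in time.

```python
import math


def simulate_two_layer(a0, b0, s, lr=1e-3, steps=20000):
    """Euler-discretised gradient flow on L = 0.5 * (s - a*b)**2.

    Hypothetical toy model: a scalar two-layer linear network f(x) = a*b*x.
    Returns the final weights and the trajectory of the product a*b.
    """
    a, b = a0, b0
    products = [a * b]
    for _ in range(steps):
        err = s - a * b
        # Feedback: the update of a is scaled by b, and the update of b by a.
        a, b = a + lr * b * err, b + lr * a * err
        products.append(a * b)
    return a, b, products


def logistic_solution(u0, s, t):
    """Closed form of du/dt = 2*u*(s - u); assumes balanced init a0 == b0."""
    return s / (1.0 + (s / u0 - 1.0) * math.exp(-2.0 * s * t))


# Small balanced initialization: a long plateau, then a sharp rise to s.
a0 = b0 = 0.01
s = 1.0
lr, steps = 1e-3, 20000
a, b, products = simulate_two_layer(a0, b0, s, lr, steps)

for step in (0, 1000, 5000, 20000):
    t = step * lr
    print(f"t={t:5.1f}  simulated={products[step]:.4f}  "
          f"closed form={logistic_solution(a0 * b0, s, t):.4f}")
```

With a small initialization the product stays near zero for a while, because each layer's gradient is proportional to the other (still small) layer, and then rises quickly once both have grown. This plateau-then-jump shape is the kind of sigmoidal learning the paper refers to.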
