A Tale of Two Circuits: Grokking as Competition of Sparse and Dense Subnetworks
William Merrill, Nikolaos Tsilivis, Aman Shukla
TL;DR
The paper studies grokking on a sparse parity task and shows that a dense subnetwork governs predictions before the grokking transition, while a sparse subnetwork emerges and dominates afterwards. The transition is associated with targeted norm growth in a small subset of neurons, which effectively sparsifies the network and sets up a competition between the two subnetworks. The findings connect sparsity and norm dynamics to generalization and suggest a mechanism that could underlie emergent behaviors in larger models, with implications for how targeted weight growth and subnetwork specialization enable robust generalization in overparameterized networks.
Abstract
Grokking is a phenomenon where a model trained on an algorithmic task first overfits but then, after a large amount of additional training, undergoes a phase transition to perfect generalization. We empirically study the internal structure of networks undergoing grokking on the sparse parity task, and find that the grokking phase transition corresponds to the emergence of a sparse subnetwork that dominates model predictions. On an optimization level, we find that this subnetwork arises when a small subset of neurons undergoes rapid norm growth, whereas the other neurons in the network decay slowly in norm. Thus, we suggest that the grokking phase transition can be understood as emerging from competition between two largely distinct subnetworks: a dense one that dominates before the transition and generalizes poorly, and a sparse one that dominates afterwards.
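To make the quantities in the abstract concrete, the following is a minimal NumPy sketch, not the authors' code. It assumes a one-hidden-layer ReLU network, inputs in {-1, +1}, a parity label defined as the product of a few chosen coordinates, and a per-neuron norm taken over each hidden neuron's incoming weights, bias, and outgoing weight. All function names (`sparse_parity_batch`, `neuron_norms`, `top_neurons_mask`, `forward`) and these modeling choices are illustrative assumptions.

```python
import numpy as np

def sparse_parity_batch(rng, n_samples, n_bits, parity_idx):
    """Sample inputs in {-1, +1}^n_bits; the label is the product of the
    coordinates listed in parity_idx (assumed encoding of sparse parity)."""
    x = rng.choice([-1.0, 1.0], size=(n_samples, n_bits))
    y = np.prod(x[:, parity_idx], axis=1)
    return x, y

def neuron_norms(w_in, b_in, w_out):
    """L2 norm of each hidden neuron's parameters: its row of incoming
    weights, its bias, and its outgoing weight (an assumed definition)."""
    per_neuron = np.concatenate([w_in, b_in[:, None], w_out[:, None]], axis=1)
    return np.linalg.norm(per_neuron, axis=1)

def top_neurons_mask(norms, n_keep):
    """Boolean mask selecting the n_keep neurons with the largest norm."""
    mask = np.zeros_like(norms, dtype=bool)
    mask[np.argsort(norms)[-n_keep:]] = True
    return mask

def forward(x, w_in, b_in, w_out, mask=None):
    """One-hidden-layer ReLU network; mask restricts the output to a
    subset of hidden neurons (a candidate subnetwork)."""
    h = np.maximum(x @ w_in.T + b_in, 0.0)
    if mask is not None:
        h = h * mask
    return h @ w_out
```

Because the output layer in this sketch is linear in the hidden activations, the full network's output is the sum of the outputs of the masked subnetwork and its complement. Tracking `neuron_norms` over training checkpoints and comparing `forward(..., mask)` against `forward(..., ~mask)` is one way to see which group of neurons drives predictions at a given step.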
