Emergence of Computational Structure in a Neural Network Physics Simulator
Rohan Hitchcock, Gary W. Delaney, Jonathan H. Manton, Richard Scalzo, Jingge Zhu
TL;DR
The paper investigates how interpretable computational structures emerge in a transformer-based neural network trained to simulate a particle system under gravity. By introducing collision detection as a measurable attention-head behavior and analyzing the attention-distance correlation alongside the local learning coefficient (LLC), the authors show that collision-detection heads arise in conjunction with degenerate loss-landscape geometry and that their emergence follows power-law dynamics, which they describe via an effective potential. They draw parallels to second-order phase transitions to interpret these dynamics and discuss implications for convergence times and early training interventions. While the study offers a mechanistic view of emergent computation in this physics simulator, it is limited to a single model and calls for broader validation across architectures and tasks.
Abstract
Neural networks often have identifiable computational structures (components of the network which perform an interpretable algorithm or task), but the mechanisms by which these emerge and the best methods for detecting these structures are not well understood. In this paper we investigate the emergence of computational structure in a transformer-like model trained to simulate the physics of a particle system, where the transformer's attention mechanism is used to transfer information between particles. We show that (a) structures emerge in the attention heads of the transformer which learn to detect particle collisions, (b) the emergence of these structures is associated with degenerate geometry in the loss landscape, and (c) the dynamics of this emergence follow a power law. This suggests that these components are governed by a degenerate "effective potential". These results have implications for the convergence time of computational structure within neural networks and suggest that the emergence of computational structure can be detected by studying the dynamics of network components.
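To make claim (c) concrete, the following is a minimal sketch of how one might check whether a per-head quantity tracked over training follows a power law, by fitting a straight line in log-log space. It is not the authors' procedure: the function name `fit_power_law`, the synthetic trajectory, and the choice of an ordinary least-squares fit are all assumptions made for illustration, and the abstract does not specify which quantity is fitted.

```python
import numpy as np

def fit_power_law(steps, values):
    """Fit values ~ c * steps**alpha by least squares in log-log space.

    Hypothetical helper for illustration; requires strictly positive inputs.
    Returns the exponent alpha and the prefactor c.
    """
    log_t = np.log(np.asarray(steps, dtype=float))
    log_y = np.log(np.asarray(values, dtype=float))
    # A power law is a straight line in log-log coordinates:
    # log y = alpha * log t + log c
    alpha, log_c = np.polyfit(log_t, log_y, 1)
    return alpha, np.exp(log_c)

# Synthetic stand-in for a measured trajectory (assumed, not data from the paper):
# a power law with exponent -0.5 and prefactor 2.0, with small multiplicative noise.
rng = np.random.default_rng(0)
steps = np.arange(1, 1001, dtype=float)
values = 2.0 * steps ** -0.5 * np.exp(rng.normal(0.0, 0.01, size=steps.size))

alpha, c = fit_power_law(steps, values)
print(f"fitted exponent: {alpha:.3f}, fitted prefactor: {c:.3f}")
```

A good linear fit in log-log space is only suggestive of a power law. In practice one would also compare against alternatives such as exponential decay before drawing conclusions about the underlying dynamics.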
