Depth-Wise Emergence of Prediction-Centric Geometry in Large Language Models
Shahar Haim, Daniel C McNamee
TL;DR
This paper asks how decoder-only LLMs transform context into predictions, using a mechanistic-geometry framework that combines geometric analysis with causal interventions. It finds a depth-wise transition from a context-processing phase to a prediction-forming phase. In the late, prediction-centric phase, representations follow a two-component geometric code: angular structure parametrizes the similarity of prediction distributions, while representation norms carry context-specific information that does not determine predictions. Both input-centric and output-centric interventions indicate that late-layer angular geometry causally controls token identity. For model interpretation and control, the findings suggest that targeting this causally operative angular structure could enable more reliable and principled manipulation of predictions, and that architectural choices such as normalization may promote angular coding.
Abstract
We show that decoder-only large language models exhibit a depth-wise transition from context-processing to prediction-forming phases of computation, accompanied by a reorganization of representational geometry. Using a unified framework combining geometric analysis with mechanistic intervention, we demonstrate that late-layer representations implement a structured geometric code that enables selective causal control over token prediction. Specifically, the angular organization of the representation geometry parametrizes the similarity of prediction distributions, while representation norms encode context-specific information that does not determine predictions. Together, these results provide a mechanistic-geometric account of how LLMs transform context into predictions.
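The norm/angle distinction can be made concrete with a toy example. The sketch below is not the authors' method or data: it assumes a hypothetical readout consisting of an RMS normalization followed by a random unembedding matrix `W_U`, and all names, sizes, and the rotation angle are illustrative. Under that assumption, rescaling a hidden vector leaves the predicted distribution essentially unchanged, whereas rotating it at fixed norm changes the distribution.

```python
import numpy as np

def rms_norm(h, eps=1e-8):
    # Scale-invariant normalization (up to eps); assumed here, not taken from the paper.
    return h / np.sqrt(np.mean(h ** 2) + eps)

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def predict(h, W_U):
    # Toy readout: normalize the hidden state, then project onto the vocabulary.
    return softmax(W_U @ rms_norm(h))

def split_norm_direction(h):
    # Decompose a vector into its length and its unit direction.
    norm = np.linalg.norm(h)
    return norm, h / norm

def total_variation(p, q):
    # Total variation distance between two probability vectors.
    return 0.5 * np.abs(p - q).sum()

rng = np.random.default_rng(0)
d_model, vocab = 64, 50                      # hypothetical sizes
W_U = rng.normal(size=(vocab, d_model)) / np.sqrt(d_model)
h = rng.normal(size=d_model)                 # stand-in for a late-layer state

norm, direction = split_norm_direction(h)
p_base = predict(h, W_U)

# Norm intervention: same direction, three times the length.
p_scaled = predict(3.0 * norm * direction, W_U)

# Angular intervention: rotate by 60 degrees toward an orthogonal unit vector,
# keeping the norm fixed.
other = rng.normal(size=d_model)
other -= (other @ direction) * direction     # remove the component along direction
other /= np.linalg.norm(other)
theta = np.pi / 3
rotated = norm * (np.cos(theta) * direction + np.sin(theta) * other)
p_rotated = predict(rotated, W_U)

tv_scaled = total_variation(p_base, p_scaled)
tv_rotated = total_variation(p_base, p_rotated)
print(f"TV after rescaling: {tv_scaled:.2e}")
print(f"TV after rotation:  {tv_rotated:.2e}")
```

In this toy setting the insensitivity to norm follows directly from the assumed normalization step; whether and how real models use norm and angle in this way is the empirical question the paper addresses.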
