Depth-Wise Emergence of Prediction-Centric Geometry in Large Language Models

Shahar Haim, Daniel C McNamee

TL;DR

This paper asks how decoder-only LLMs transform context into predictions and proposes a mechanistic-geometry framework that combines geometric analysis with causal interventions. It demonstrates a depth-wise transition from context-processing to prediction-forming computation and identifies a late-layer angular code that parametrizes the similarity of prediction distributions, while representation norms encode context-specific information that does not determine predictions. The key contributions are the discovery of two distinct computational phases, evidence for a two-component geometric code in the prediction-centric phase, and a demonstration, supported by both input- and output-centric interventions, that angular geometry in late layers causally controls token identity. For model interpretation and control, the findings suggest that targeting the causally operative angular structure in late layers could enable more reliable and principled manipulation of predictions; architectural choices such as normalization may promote angular coding.

Abstract

We show that decoder-only large language models exhibit a depth-wise transition from context-processing to prediction-forming phases of computation accompanied by a reorganization of representational geometry. Using a unified framework combining geometric analysis with mechanistic intervention, we demonstrate that late-layer representations implement a structured geometric code that enables selective causal control over token prediction. Specifically, angular organization of the representation geometry parametrizes prediction distributional similarity, while representation norms encode context-specific information that does not determine prediction. Together, these results provide a mechanistic-geometric account of the dynamics of transforming context into predictions in LLMs.
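
The split between angular structure and norm described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a hidden representation is a plain vector, and the helper names (`split_norm_direction`, `angular_swap`, `norm_swap`) are hypothetical. It shows one plausible way to edit only the direction or only the norm of a representation, in the spirit of an intervention that targets a single component.

```python
import numpy as np


def split_norm_direction(h, eps=1e-12):
    """Split a vector into its Euclidean norm and its unit direction."""
    norm = float(np.linalg.norm(h))
    direction = h / max(norm, eps)  # eps guards against a zero vector
    return norm, direction


def angular_swap(h_target, h_source):
    """Hypothetical 'pure angular' edit: take the source direction, keep the target norm."""
    norm_t, _ = split_norm_direction(h_target)
    _, dir_s = split_norm_direction(h_source)
    return norm_t * dir_s


def norm_swap(h_target, h_source):
    """Hypothetical 'pure norm' edit: keep the target direction, take the source norm."""
    _, dir_t = split_norm_direction(h_target)
    norm_s, _ = split_norm_direction(h_source)
    return norm_s * dir_t


# Toy vectors standing in for two late-layer representations (assumed, not real data).
h_a = np.array([3.0, 4.0])  # norm 5, direction (0.6, 0.8)
h_b = np.array([0.0, 2.0])  # norm 2, direction (0.0, 1.0)

edited_angle = angular_swap(h_a, h_b)  # direction of h_b, norm of h_a
edited_norm = norm_swap(h_a, h_b)      # direction of h_a, norm of h_b
```

In a real model, the edited vector would be written back into the residual stream at a chosen layer and the change in output logits measured; that step depends on the model and tooling and is left out here.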

Paper Structure (20 sections, 2 equations, 8 figures)

Figures (8)

  • Figure 1: Per-layer intervention experiment results. Average logit preference difference is shown: blue curves correspond to input-based interventions (interval and month), the red curve to the output-based intervention. The blue vertical line marks the phase-change point.
  • Figure 2: Per-layer participation ratio (PR) of normalized pilot-token representations for ordered long and short sequences, with shuffled counterparts shown as baselines. Top row: identical tokens. Bottom row: non-identical tokens. Columns correspond to models (Llama, Mistral, Qwen, left to right). The blue vertical line marks the perturbation-based phase-change point.
  • Figure 3: Per-layer correlation between the pairwise distances of non-identical tokens and the symmetric KL divergence between their prediction distributions. Top row: angular distances. Bottom row: Euclidean distances. Columns correspond to models. The blue vertical line marks the perturbation-based phase-change point.
  • Figure 4: Per-layer intervention experiment results. Top row: pure angular interventions. Bottom row: pure norm interventions. Average logit preference difference is shown: blue curves correspond to input-based interventions (interval and month), red curves to output-based interventions. The blue vertical line marks the original perturbation-based phase-change point.
  • Figure S1: Visual illustration of two intervention strategies. Each square denotes a token representation. Input-based intervention (blue) perturbs the representation of a selected input token using the representation of another token, effectively modifying the input sequence prior to generation. Output-based intervention (red) directly manipulates the representation of the final token to steer the model’s output toward a desired target, using alternative representations known to produce that output.
  • ...and 3 more figures
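
The quantities named in the Figure 2 and Figure 3 captions can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the participation ratio uses the common covariance-eigenvalue definition, symmetric KL is taken as KL(p||q) + KL(q||p) without a factor of 1/2, the correlation is Pearson, and all function names are hypothetical. The paper may define or normalize these quantities differently.

```python
import numpy as np


def participation_ratio(X):
    """PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues) of the row covariance of X."""
    Xc = X - X.mean(axis=0, keepdims=True)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)  # clip tiny negative round-off
    return float(eig.sum() ** 2 / np.sum(eig ** 2))


def angular_distance_matrix(X, eps=1e-12):
    """Pairwise angles (radians) between the rows of X."""
    U = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), eps)
    cos = np.clip(U @ U.T, -1.0, 1.0)
    return np.arccos(cos)


def symmetric_kl_matrix(P, eps=1e-12):
    """Pairwise KL(p_i || p_j) + KL(p_j || p_i) for the rows of P (each row a distribution)."""
    log_p = np.log(np.clip(P, eps, None))
    self_term = np.sum(P * log_p, axis=1)  # sum_k p_ik log p_ik
    cross = P @ log_p.T                    # cross[i, j] = sum_k p_ik log p_jk
    kl = self_term[:, None] - cross        # kl[i, j] = KL(p_i || p_j)
    return kl + kl.T


def upper_triangle_correlation(A, B):
    """Pearson correlation between the off-diagonal upper triangles of two square matrices."""
    iu = np.triu_indices_from(A, k=1)
    return float(np.corrcoef(A[iu], B[iu])[0, 1])


# Synthetic stand-ins for one layer's representations and the matching
# next-token distributions (random data, for illustration only).
rng = np.random.default_rng(0)
reps = rng.normal(size=(50, 8))
logits = rng.normal(size=(50, 20))
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

pr = participation_ratio(reps)
corr = upper_triangle_correlation(
    angular_distance_matrix(reps), symmetric_kl_matrix(probs)
)
```

Repeating the computation at every layer would give one participation-ratio curve and one correlation curve per model, which is the shape of result the captions describe. With the random data used here, no meaningful correlation should be expected.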