Table of Contents
Fetching ...

A Gauge Theory of Superposition: Toward a Sheaf-Theoretic Atlas of Neural Representations

Hossein Javidnia

TL;DR

A discrete gauge-theoretic framework for superposition in large language models (LLMs) that replaces the single-global-dictionary premise with a sheaf-theoretic atlas of local semantic charts is developed, making holonomy computable and gauge-invariant.

Abstract

We develop a discrete gauge-theoretic framework for superposition in large language models (LLMs) that replaces the single-global-dictionary premise with a sheaf-theoretic atlas of local semantic charts. Contexts are clustered into a stratified context complex; each chart carries a local feature space and a local information-geometric metric (Fisher/Gauss--Newton) identifying predictively consequential feature interactions. This yields a Fisher-weighted interference energy and three measurable obstructions to global interpretability: (O1) local jamming (active load exceeds Fisher bandwidth), (O2) proxy shearing (mismatch between geometric transport and a fixed correspondence proxy), and (O3) nontrivial holonomy (path-dependent transport around loops). We prove and instantiate four results on a frozen open LLM (Llama~3.2~3B Instruct) using WikiText-103, a C4-derived English web-text subset, and \texttt{the-stack-smol}. (A) After constructive gauge fixing on a spanning tree, each chord residual equals the holonomy of its fundamental cycle, making holonomy computable and gauge-invariant. (B) Shearing lower-bounds a data-dependent transfer mismatch energy, turning $D_{\mathrm{shear}}$ into an unavoidable failure bound. (C) We obtain non-vacuous certified jamming/interference bounds with high coverage and zero violations across seeds/hyperparameters. (D) Bootstrap and sample-size experiments show stable estimation of $D_{\mathrm{shear}}$ and $D_{\mathrm{hol}}$, with improved concentration on well-conditioned subsystems.

A Gauge Theory of Superposition: Toward a Sheaf-Theoretic Atlas of Neural Representations

TL;DR

A discrete gauge-theoretic framework for superposition in large language models (LLMs) that replaces the single-global-dictionary premise with a sheaf-theoretic atlas of local semantic charts is developed, making holonomy computable and gauge-invariant.

Abstract

We develop a discrete gauge-theoretic framework for superposition in large language models (LLMs) that replaces the single-global-dictionary premise with a sheaf-theoretic atlas of local semantic charts. Contexts are clustered into a stratified context complex; each chart carries a local feature space and a local information-geometric metric (Fisher/Gauss--Newton) identifying predictively consequential feature interactions. This yields a Fisher-weighted interference energy and three measurable obstructions to global interpretability: (O1) local jamming (active load exceeds Fisher bandwidth), (O2) proxy shearing (mismatch between geometric transport and a fixed correspondence proxy), and (O3) nontrivial holonomy (path-dependent transport around loops). We prove and instantiate four results on a frozen open LLM (Llama~3.2~3B Instruct) using WikiText-103, a C4-derived English web-text subset, and \texttt{the-stack-smol}. (A) After constructive gauge fixing on a spanning tree, each chord residual equals the holonomy of its fundamental cycle, making holonomy computable and gauge-invariant. (B) Shearing lower-bounds a data-dependent transfer mismatch energy, turning into an unavoidable failure bound. (C) We obtain non-vacuous certified jamming/interference bounds with high coverage and zero violations across seeds/hyperparameters. (D) Bootstrap and sample-size experiments show stable estimation of and , with improved concentration on well-conditioned subsystems.
Paper Structure (55 sections, 10 theorems, 26 equations, 4 figures, 10 tables, 1 algorithm)

This paper contains 55 sections, 10 theorems, 26 equations, 4 figures, 10 tables, 1 algorithm.

Key Result

Proposition 3.1

$G^{(c)}$ is symmetric positive semidefinite.

Figures (4)

  • Figure 1: Holonomy defect distributions (normalised) comparing baseline ($s_{\min}=0$) and persistent ($s_{\min}=0.015$) subsystems.
  • Figure 2: Baseline subsystem: (left) $\widehat{\Delta}_{uv}$ vs lower bound $\widehat{\mathrm{LB}}_{uv}$ (points above diagonal), (middle) $\widehat{\Delta}_{uv}$ vs $D_{\mathrm{shear}}(u,v)$, (right) slack distribution.
  • Figure 3: Persistent subsystem ($s_{\min}=0.015$): the same diagnostics as Figure \ref{['fig:B-base']}.
  • Figure 4: Result C diagnostics (PDF). Left: certificate check $\mathcal{E}_A^{(r)}\ge \widehat{\mathrm{LB}}$. Middle: projected interference energy vs jamming index $J(c)$. Right: certified subset energy vs $J(c)$.

Theorems & Definitions (29)

  • Definition 2.1: Stratified context complex
  • Definition 2.2: Local frame
  • Definition 2.3: Rectangular partial transport
  • Proposition 3.1: PSD
  • proof
  • Definition 3.2: Harm matrix
  • Definition 3.3: Local Fisher-weighted interference energy
  • Proposition 3.4: Nonnegativity and zero set
  • proof
  • Definition 3.5: Effective rank
  • ...and 19 more