A Gauge Theory of Superposition: Toward a Sheaf-Theoretic Atlas of Neural Representations

Hossein Javidnia

TL;DR

This paper develops a discrete gauge-theoretic framework for superposition in large language models (LLMs). It replaces the single-global-dictionary premise with a sheaf-theoretic atlas of local semantic charts and makes holonomy computable and gauge-invariant.

Abstract

We develop a discrete gauge-theoretic framework for superposition in large language models (LLMs) that replaces the single-global-dictionary premise with a sheaf-theoretic atlas of local semantic charts. Contexts are clustered into a stratified context complex; each chart carries a local feature space and a local information-geometric metric (Fisher/Gauss-Newton) identifying predictively consequential feature interactions. This yields a Fisher-weighted interference energy and three measurable obstructions to global interpretability: (O1) local jamming (active load exceeds Fisher bandwidth), (O2) proxy shearing (mismatch between geometric transport and a fixed correspondence proxy), and (O3) nontrivial holonomy (path-dependent transport around loops). We prove and instantiate four results on a frozen open LLM (Llama 3.2 3B Instruct) using WikiText-103, a C4-derived English web-text subset, and the-stack-smol. (A) After constructive gauge fixing on a spanning tree, each chord residual equals the holonomy of its fundamental cycle, making holonomy computable and gauge-invariant. (B) Shearing lower-bounds a data-dependent transfer mismatch energy, turning $D_{\mathrm{shear}}$ into an unavoidable failure bound. (C) We obtain non-vacuous certified jamming/interference bounds with high coverage and zero violations across seeds/hyperparameters. (D) Bootstrap and sample-size experiments show stable estimation of $D_{\mathrm{shear}}$ and $D_{\mathrm{hol}}$, with improved concentration on well-conditioned subsystems.
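Result (A) can be illustrated on a toy example. The sketch below is not the paper's implementation: it assumes square orthogonal transports on a three-chart triangle, whereas the paper works with rectangular partial transports, and all names (`gauge_fix`, `chord_residual`, `defect`, the rotation angles) are hypothetical. It fixes the gauge along a spanning tree so that tree edges become the identity, then reads the loop holonomy off the one remaining chord.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def gauge_fix(tree_edges, transports, dim, root=0):
    """Return gauges g[v] = P_v^{-1}, where P_v is the transport along the
    tree path root -> v. tree_edges must be listed parent-first.
    Assumes orthogonal transports, so the inverse is the transpose."""
    path = {root: identity(dim)}
    for u, v in tree_edges:
        path[v] = matmul(transports[(u, v)], path[u])
    return {v: transpose(p) for v, p in path.items()}

def chord_residual(gauge, transports, a, b):
    """Gauge-fixed transport on edge a -> b: g_b T_ab g_a^{-1}."""
    return matmul(matmul(gauge[b], transports[(a, b)]), transpose(gauge[a]))

def defect(h):
    """Frobenius distance of h from the identity (a toy holonomy defect)."""
    n = len(h)
    return math.sqrt(sum((h[i][j] - (1.0 if i == j else 0.0)) ** 2
                         for i in range(n) for j in range(n)))

# Toy atlas: charts 0, 1, 2; transports[(u, v)] maps chart u's frame to chart v's.
transports = {(0, 1): rot_z(0.3), (1, 2): rot_x(0.5), (2, 0): rot_z(-0.2)}
tree = [(0, 1), (1, 2)]  # spanning tree rooted at chart 0; (2, 0) is the chord

gauge = gauge_fix(tree, transports, dim=3)
residual = chord_residual(gauge, transports, 2, 0)

# Direct holonomy of the loop 0 -> 1 -> 2 -> 0, based at chart 0.
loop = matmul(transports[(2, 0)],
              matmul(transports[(1, 2)], transports[(0, 1)]))
```

In this toy setting `residual` coincides with `loop`, and `defect(residual)` does not change when every chart's frame is rotated independently, because a change of gauge only conjugates the holonomy by an orthogonal matrix.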

Paper Structure (55 sections, 10 theorems, 26 equations, 4 figures, 10 tables, 1 algorithm)

Introduction
Contribution and paper thesis
Significance and impact
Related work
Scope
Setup: context complex, local charts, transports
Activations and clustering
Stratified context complex
Local feature spaces and frames
Rectangular partial transports
Local information geometry and Fisher-weighted interference
Local Fisher/Gauss--Newton metric
Harm matrix
Frame geometry and interference energy
Effective rank and the jamming index
...and 40 more sections

Key Result

Proposition 3.1

$G^{(c)}$ is symmetric positive semidefinite.
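The definition of $G^{(c)}$ is not reproduced on this page. As a hedged sketch of why such a statement holds, assume the standard Gauss-Newton form for a softmax output, where $J(x)$ (the Jacobian of the logits with respect to the local features) and $p(x)$ (the predictive distribution) are assumed notation rather than the paper's:

```latex
% Assumed form (not taken from the paper):
G^{(c)} \;=\; \mathbb{E}_{x \in c}\!\left[ J(x)^{\top} H(x)\, J(x) \right],
\qquad
H(x) \;=\; \operatorname{diag}\!\big(p(x)\big) - p(x)\,p(x)^{\top}.

% Symmetry: H(x) is symmetric, so each J^\top H J is symmetric, and so is the average.
% Positive semidefiniteness: for any vector v, writing z = J(x) v,
v^{\top} G^{(c)} v
  \;=\; \mathbb{E}_{x \in c}\!\left[ z^{\top} H(x)\, z \right]
  \;=\; \mathbb{E}_{x \in c}\!\left[ \sum_{y} p_y z_y^{2} - \Big( \sum_{y} p_y z_y \Big)^{2} \right]
  \;=\; \mathbb{E}_{x \in c}\!\left[ \operatorname{Var}_{y \sim p(x)}(z_y) \right]
  \;\ge\; 0 .
```

Under this assumption the quadratic form is an average of variances, which is why it cannot be negative; the paper's own proof may proceed differently.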

Figures (4)

Figure 1: Holonomy defect distributions (normalised) comparing baseline ($s_{\min}=0$) and persistent ($s_{\min}=0.015$) subsystems.
Figure 2: Baseline subsystem: (left) $\widehat{\Delta}_{uv}$ vs lower bound $\widehat{\mathrm{LB}}_{uv}$ (points above diagonal), (middle) $\widehat{\Delta}_{uv}$ vs $D_{\mathrm{shear}}(u,v)$, (right) slack distribution.
Figure 3: Persistent subsystem ($s_{\min}=0.015$): the same diagnostics as Figure 2.
Figure 4: Result C diagnostics (PDF). Left: certificate check $\mathcal{E}_A^{(r)}\ge \widehat{\mathrm{LB}}$. Middle: projected interference energy vs jamming index $J(c)$. Right: certified subset energy vs $J(c)$.

Theorems & Definitions (29)

Definition 2.1: Stratified context complex
Definition 2.2: Local frame
Definition 2.3: Rectangular partial transport
Proposition 3.1: PSD
proof
Definition 3.2: Harm matrix
Definition 3.3: Local Fisher-weighted interference energy
Proposition 3.4: Nonnegativity and zero set
proof
Definition 3.5: Effective rank
...and 19 more
