How do Language Models Bind Entities in Context?

Jiahai Feng; Jacob Steinhardt

TL;DR

Language models must bind entities to their contextual attributes. The authors propose the binding ID mechanism as a general internal mechanism for doing so and validate it with causal mediation analyses. They show that binding is implemented by additive binding functions that attach binding ID vectors to entity and attribute representations, that these vectors form a continuous subspace, and that binding IDs generalize across tasks and model scale, so that they transfer between tasks. They also identify an alternative direct-binding mechanism in multiple-choice question (MCQ) tasks, which shows that the binding ID mechanism is not universal. The work advances the interpretability of in-context reasoning and suggests that transferable symbolic representations emerge in large LMs.

Abstract

To correctly use in-context information, language models (LMs) must bind entities to their attributes. For example, given a context describing a "green square" and a "blue circle", LMs must bind the shapes to their respective colors. We analyze LM representations and identify the binding ID mechanism: a general mechanism for solving the binding problem, which we observe in every sufficiently large model from the Pythia and LLaMA families. Using causal interventions, we show that LMs' internal activations represent binding information by attaching binding ID vectors to corresponding entities and attributes. We further show that binding ID vectors form a continuous subspace, in which distances between binding ID vectors reflect their discernability. Overall, our results uncover interpretable strategies in LMs for representing symbolic knowledge in-context, providing a step towards understanding general in-context reasoning in large-scale LMs.
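
The additive picture described above can be illustrated with a toy model. The sketch below is not the authors' code and does not touch a real LM: the content vectors, the binding ID vectors, the use of a single shared ID vector per pair, and the helper names (`bind`, `answer`) are all assumptions made for illustration. It stores each token as content plus binding ID, answers a query by matching binding IDs, and shows that adding the difference between two binding ID vectors to the attribute activations swaps which attribute each entity is bound to.

```python
# Toy illustration of additive binding IDs (hypothetical vectors, not real LM activations).

def vadd(u, v):
    return [x + y for x, y in zip(u, v)]

def vsub(u, v):
    return [x - y for x, y in zip(u, v)]

def sqdist(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))

# Assumed content vectors for the "green square" / "blue circle" example.
CONTENT = {
    "square": [2, 0, 1, 0],
    "circle": [0, 2, 0, 1],
    "green": [1, 1, 0, 0],
    "blue": [0, 0, 1, 1],
}
# Assumed binding ID vectors b(0), b(1); shared by entities and attributes for simplicity.
BINDING_ID = [[3, 0, 0, -3], [0, 3, -3, 0]]

def bind(token, k):
    """Additive binding: activation = content(token) + b(k)."""
    return vadd(CONTENT[token], BINDING_ID[k])

def answer(entity_acts, attribute_acts, queried_entity):
    """Return the attribute whose binding ID component is closest to the queried entity's."""
    entity_id = vsub(entity_acts[queried_entity], CONTENT[queried_entity])

    def mismatch(attribute):
        attribute_id = vsub(attribute_acts[attribute], CONTENT[attribute])
        return sqdist(entity_id, attribute_id)

    return min(attribute_acts, key=mismatch)

entity_acts = {"square": bind("square", 0), "circle": bind("circle", 1)}
attribute_acts = {"green": bind("green", 0), "blue": bind("blue", 1)}

baseline = (
    answer(entity_acts, attribute_acts, "square"),
    answer(entity_acts, attribute_acts, "circle"),
)

# Intervention: shift attribute activations by b(1) - b(0) to swap their binding IDs.
delta = vsub(BINDING_ID[1], BINDING_ID[0])
swapped_attribute_acts = {
    "green": vadd(attribute_acts["green"], delta),
    "blue": vsub(attribute_acts["blue"], delta),
}
swapped = (
    answer(entity_acts, swapped_attribute_acts, "square"),
    answer(entity_acts, swapped_attribute_acts, "circle"),
)

print(baseline)  # ('green', 'blue')
print(swapped)   # ('blue', 'green')
```

In this toy setting the swap works exactly because binding is additive; in a real LM the corresponding claim is an empirical one that the paper tests with interventions on activations.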

Paper Structure (27 sections, 9 equations, 18 figures, 4 tables)

Introduction
Preliminaries
Existence of Binding IDs
Factorizability of activations
Position independence
Structure of Binding ID
Additivity of Binding Functions
The Geometry of Binding ID Vectors
Generality and Limitations of Binding ID
Related work
Conclusion
Evaluation details
Necessity of Binding ID mechanism
Details for Position Independence
Binding Task Details
...and 12 more sections

Figures (18)

Figure 1: The Binding ID mechanism. The LM learns abstract binding IDs (drawn as triangles or squares) which distinguish between entity-attribute pairs. Binding functions $\Gamma_E$ and $\Gamma_A$ bind entities and attributes to their abstract binding ID, and store the results in the activations. To answer queries, the LM identifies the attribute that shares the same binding ID as the queried entity.
Figure 2: a) Causal diagram for autoregressive LMs. From input context $\operatorname{ctxt}(e_0\leftrightarrow a_0, e_1\leftrightarrow a_1)$, the LM constructs internal representations $Z_{\text{context}}$. We will mainly study the components of $Z_{\text{context}}$ boxed in blue. b) A secondary run of the LM on context $\operatorname{ctxt}(e_2\leftrightarrow a_2, e_3\leftrightarrow a_3)$ to produce $Z_{\text{context}}'$. c) An example intervention where $Z_{\text{context}}$ is modified by replacing $Z_{A_0} \rightarrow Z_{A_0}'$ from $Z_{\text{context}}'$.
Figure 3: Factorizability results. Each row corresponds to querying for a particular entity. Plotted are the mean log prob for all four attributes. Highlighted squares are predicted by factorizability.
Figure 4: Top: Mean log probs for entity interventions. Bottom: Mean log probs for attributes. For brevity, let $Z_k$ refer to $Z_{E_k}$ or $Z_{A_k}$. The grey and green vertical lines indicate the original positions for $Z_0$ and $Z_1$ respectively. The x-axis marks $x$, $Z_0$'s new position. Under the position interventions $\{X_{0} \rightarrow x, X_1 \rightarrow X_1 - (x - X_{0})\}$, the grey line is the control condition with no interventions, and the green line is the swapped condition where $Z_0$ and $Z_1$ have swapped positions.
Figure 5: The plots show the mean median-calibrated accuracy when one pair of binding IDs, $v_0$, is fixed at the green circle, and the other, $v_1$, is varied across the grid. The binding IDs $b(0)$, $b(1)$, and $b(2)$ are shown as the origin of the arrows, the end of the horizontal arrow, and the end of the diagonal arrow respectively. We use LLaMA-13b for computational reasons.
...and 13 more figures
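
The intervention in Figure 2c, replacing $Z_{A_0}$ with $Z_{A_0}'$ from a second run, can be sketched on a toy stand-in for the LM. Everything below is an assumption for illustration: the token vocabulary, the one-hot content and binding ID vectors, and the helpers `run_context` and `answer` are hypothetical and do not come from the paper. Under the binding ID hypothesis one would expect the patched context to answer a query for $e_0$ with $a_2$, since $Z_{A_0}'$ carries the content of $a_2$ but the binding ID of position 0.

```python
# Toy sketch of the activation-patching intervention (hypothetical stand-in, not a real LM).
from itertools import product

TOKENS = ["square", "circle", "triangle", "star", "green", "blue", "red", "yellow"]
ATTRIBUTES = TOKENS[4:]
NUM_IDS = 2
DIM = len(TOKENS) + NUM_IDS

def one_hot(i):
    return [1 if j == i else 0 for j in range(DIM)]

# Assumed layout: content lives in the first len(TOKENS) dims, binding IDs in the rest.
CONTENT = {tok: one_hot(i) for i, tok in enumerate(TOKENS)}
BINDING_ID = [one_hot(len(TOKENS) + k) for k in range(NUM_IDS)]

def vadd(u, v):
    return [x + y for x, y in zip(u, v)]

def vsub(u, v):
    return [x - y for x, y in zip(u, v)]

def sqdist(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))

def run_context(pairs):
    """Stand-in for a forward pass: build Z_context with slots E0, A0, E1, A1."""
    z = {}
    for k, (entity, attribute) in enumerate(pairs):
        z[f"E{k}"] = vadd(CONTENT[entity], BINDING_ID[k])
        z[f"A{k}"] = vadd(CONTENT[attribute], BINDING_ID[k])
    return z

def answer(z, entity):
    """Find the attribute token whose binding ID residual matches the queried entity's."""
    entity_slots = [s for s in z if s.startswith("E")]
    attribute_slots = [s for s in z if s.startswith("A")]

    def mismatch(choice):
        e_slot, a_slot, token = choice
        return sqdist(vsub(z[e_slot], CONTENT[entity]), vsub(z[a_slot], CONTENT[token]))

    _, _, best = min(product(entity_slots, attribute_slots, ATTRIBUTES), key=mismatch)
    return best

z = run_context([("square", "green"), ("circle", "blue")])
z_prime = run_context([("triangle", "red"), ("star", "yellow")])

# Intervention: replace Z_{A_0} with Z'_{A_0}, leaving every other slot untouched.
patched = dict(z)
patched["A0"] = z_prime["A0"]

print(answer(z, "square"))        # green
print(answer(patched, "square"))  # red
print(answer(patched, "circle"))  # blue
```

The toy only shows what factorizability would predict if activations really decomposed this cleanly; whether a given LM behaves this way is what the interventions in the paper measure.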
