The Lattice Representation Hypothesis of Large Language Models

Bo Xiong

TL;DR

This framework unifies the Linear Representation Hypothesis with Formal Concept Analysis, showing that linear attribute directions with separating thresholds induce a concept lattice via half-space intersections.

Abstract

We propose the Lattice Representation Hypothesis of large language models: a symbolic backbone that grounds conceptual hierarchies and logical operations in embedding geometry. Our framework unifies the Linear Representation Hypothesis with Formal Concept Analysis (FCA), showing that linear attribute directions with separating thresholds induce a concept lattice via half-space intersections. This geometry enables symbolic reasoning through geometric meet (intersection) and join (union) operations, and admits a canonical form when attribute directions are linearly independent. Experiments on WordNet sub-hierarchies provide empirical evidence that LLM embeddings encode concept lattices and their logical structure, revealing a principled bridge between continuous geometry and symbolic abstraction.
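
The half-space construction described in the abstract can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration: the 2-D embeddings, the attribute names, the directions, the thresholds, and the `>=` convention are hypothetical and are not taken from the paper's experiments. An object has an attribute when its projection onto the attribute direction clears the threshold, and the extent of a set of attributes is the set of objects lying in the intersection of the corresponding half-spaces.

```python
# Toy illustration only: embeddings, directions, and thresholds are made up.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical 2-D object embeddings v_g.
embeddings = {
    "dog": (0.9, 0.8),
    "sparrow": (0.1, 0.9),
    "trout": (0.2, 0.1),
}

# Hypothetical attribute directions, each paired with a threshold tau_m.
attributes = {
    "has_fur": ((1.0, 0.0), 0.5),
    "breathes_air": ((0.0, 1.0), 0.5),
}

def has_attribute(obj, attr):
    """True when the object's projection onto the direction reaches tau_m."""
    direction, tau = attributes[attr]
    return dot(embeddings[obj], direction) >= tau

def extent(attr_set):
    """Objects inside the intersection of the half-spaces of attr_set."""
    return {g for g in embeddings if all(has_attribute(g, m) for m in attr_set)}

air = extent({"breathes_air"})
fur_and_air = extent({"has_fur", "breathes_air"})
```

Adding an attribute can only shrink the extent, so `fur_and_air` is a subset of `air`. This is the order-reversing behaviour that a concept lattice relies on.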

Paper Structure (40 sections, 9 theorems, 24 equations, 6 figures, 4 tables)

Key Result

Theorem 1

Let $G$ be a finite set of objects and $M$ a finite set of attributes. Let $V = \{ \mathbf{v}_g \in \mathbb{R}^d \mid g \in G \}$ be object embeddings and $\mathcal{D} = \{ \bar{\ell}_m \in \mathbb{R}^d \mid m \in M \}$ attribute directions. Suppose for each $m \in M$ there exists a threshold $\tau_m$ [...] Then the induced concept set (i) is closed under the Galois connection, and (ii) forms a [...]
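
For reference, the closure property in (i) refers to the derivation operators of Formal Concept Analysis. The block below states them in standard form. The first line is an assumption: because the theorem statement above is truncated, the exact threshold condition (including whether the inequality is strict) is not visible here, so the incidence relation $I$ shown is only one plausible reading.

```latex
% Assumed incidence relation: g has attribute m when its projection clears the threshold.
g \mathrel{I} m \iff \langle \mathbf{v}_g, \bar{\ell}_m \rangle \ge \tau_m

% Standard FCA derivation operators for A \subseteq G and B \subseteq M.
A' = \{ m \in M \mid g \mathrel{I} m \text{ for all } g \in A \}, \qquad
B' = \{ g \in G \mid g \mathrel{I} m \text{ for all } m \in B \}

% (A, B) is a formal concept if and only if A' = B and B' = A.
```

The pair of maps $A \mapsto A'$ and $B \mapsto B'$ forms a Galois connection between subsets of $G$ and subsets of $M$, and formal concepts are exactly its closed pairs.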

Figures (6)

  • Figure 1: How LLMs encode conceptual structure. (a) Humans represent concepts as symbols and compose them using logical operators. (b) Under the standard extensional view, LLMs encode concepts as directions, where subsumption is interpreted through the relative orientation of vectors. (c) Under our intensional view, a concept is represented as the intersection of half-spaces defined by its attributes, and compositional semantics emerges through region intersection and union.
  • Figure 2: How FCA connects to the linear lattice geometry of LLMs. (a) A formal context describing which objects satisfy which attributes. (b) The discrete concept lattice constructed exactly from this formal context. (c) The corresponding lattice geometry encoded in LLM embeddings, where each attribute is represented as a linear direction and each object as a point, and concept composition (meet and join) emerges as intersection or union of half-spaces.
  • Figure 3: Distribution of projection lengths for positive and negative objects onto the directions of the first eight attributes (sorted alphabetically) in the WN-Animal dataset.
  • Figure 4: Quantitative evaluation (MRR) of concept algebra for Meet (left) and Join (right) operators.
  • Figure 5: (a) PCA-based visualization of attribute directions in WN-Animal (top 20 most frequent attributes); (b) Performance of LLaMA-3 models of different sizes across WordNet domains.
  • ...and 1 more figure
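
The step from a formal context to its discrete concept lattice, as described in the caption of Figure 2 (a) and (b), can be sketched in a few lines. The context below is hypothetical and is not the one shown in the paper's figure. The enumeration is a brute-force pass over all attribute subsets, chosen for readability; it is exponential in the number of attributes and is not claimed to be the procedure used by the author.

```python
from itertools import combinations

# Hypothetical formal context: object -> set of attributes it satisfies.
context = {
    "dog": {"animal", "mammal"},
    "cat": {"animal", "mammal"},
    "sparrow": {"animal", "bird"},
}
all_attributes = set().union(*context.values())

def extent(attrs):
    """All objects that have every attribute in attrs."""
    return frozenset(g for g, ms in context.items() if attrs <= ms)

def intent(objs):
    """All attributes shared by every object in objs."""
    if not objs:
        # The empty set of objects shares every attribute by convention.
        return frozenset(all_attributes)
    return frozenset(set.intersection(*(context[g] for g in objs)))

def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every attribute subset."""
    concepts = set()
    attrs = sorted(all_attributes)
    for r in range(len(attrs) + 1):
        for subset in combinations(attrs, r):
            a = extent(set(subset))
            concepts.add((a, intent(a)))
    return concepts

concepts = formal_concepts()
```

Each resulting pair is closed in both directions: the intent is exactly the set of attributes shared by the extent, and the extent is exactly the set of objects having the whole intent. Ordering the pairs by extent inclusion gives the concept lattice.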

Theorems & Definitions (22)

  • Definition 1: Large Language Model
  • Definition 2: Linear representation of a binary attribute/concept (Park et al., 2023)
  • Definition 3: Formal context
  • Definition 4: Formal concept
  • Definition 5: Concept lattice
  • Theorem 1: Existence of Lattice Geometry
  • Proposition 1: Canonical representation
  • Definition 6: Concept as half-space
  • Definition 7: Concept algebra: meet and join
  • Theorem 2: Existence of Lattice Geometry
  • ...and 12 more
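
As background for the meet and join named in Definition 7, the classical lattice operations on formal concepts are given below. These are the standard FCA formulas, written with the usual derivation operator $(\cdot)'$; they are offered as the discrete counterpart of the paper's geometric operations and are not a transcription of Definition 7, whose text is not reproduced in this summary.

```latex
% Meet: intersect the extents, then close the union of the intents.
(A_1, B_1) \wedge (A_2, B_2) = \bigl( A_1 \cap A_2,\; (B_1 \cup B_2)'' \bigr)

% Join: close the union of the extents, then intersect the intents.
(A_1, B_1) \vee (A_2, B_2) = \bigl( (A_1 \cup A_2)'',\; B_1 \cap B_2 \bigr)
```

The extent of the meet is a plain intersection, which matches the abstract's description of the geometric meet as an intersection of regions. The join needs a closure step because the union of two extents is not in general an extent.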