Investigating Representation Universality: Case Study on Genealogical Representations

David D. Baek; Yuxiao Li; Max Tegmark

TL;DR

The paper investigates whether LLMs encode discrete graph-structured knowledge using universal geometric representations. It uses two complementary approaches: (1) a cone probe trained on in-context genealogy tasks to identify tree-like subspaces, with activation patching to test their causal role, and (2) cross-model stitching across diverse architectures to assess representational alignment. The findings show emergent tree-like cone embeddings in residual activations and stronger alignment in early-to-mid layers across models, which lends support to the universality hypothesis, subject to limitations from small graphs and the lack of ground-truth representations. These results advance interpretability by suggesting generalizable geometric structures in LLMs and point to future work on larger graphs and uncertainty estimation. Overall, understanding these representations could inform the design of more interpretable, robust, and controllable AI systems.

Abstract

Motivated by interpretability and reliability, we investigate whether large language models (LLMs) deploy universal geometric structures to encode discrete, graph-structured knowledge. To this end, we present two complementary lines of experimental evidence that might support the universality of graph representations. First, on an in-context genealogy Q&A task, we train a cone probe to isolate a tree-like subspace in residual stream activations and use activation patching to verify its causal effect on answers to related questions. We validate our findings across five different models. Second, we conduct model stitching experiments across models of diverse architectures and parameter counts (OPT, Pythia, Mistral, and LLaMA; 410 million to 8 billion parameters), quantifying representational alignment via the relative degradation in next-token prediction loss. More generally, we conclude that the lack of ground-truth representations of graphs makes it challenging to study how LLMs represent them. Ultimately, improving our understanding of LLM representations could facilitate the development of more interpretable, robust, and controllable AI systems.
