Probing the topology of the space of tokens with structured prompts

Michael Robinson, Sourya Dey, Taisa Kushner

TL;DR

The paper addresses the problem of recovering the hidden token input embedding subspace $T$ inside the latent space $X$ of an LLM, and links its topology to model behavior. It introduces a general structured prompting method and proves in Theorem 1 that, under transversality conditions on the autoregressive map and a measurement map, the collected data recovers $T$ up to homeomorphism. Empirical results on Llemma-7B show a stratified token subspace with a base dimension of around 5–10 and a high-dimensional fiber, and indicate that a low-dimensional embedding into $\mathbb{R}^{90}$ can preserve the topology. The approach generalizes to nonlinear autoregressive processes, providing a principled topological lens for analyzing and interpreting black-box sequence models.

Abstract

This article presents a general and flexible method for prompting a large language model (LLM) to reveal its (hidden) token input embedding up to homeomorphism. Moreover, this article provides strong theoretical justification -- a mathematical proof for generic LLMs -- for why this method should be expected to work. With this method in hand, we demonstrate its effectiveness by recovering the token subspace of Llemma-7B. The results of this paper apply not only to LLMs but also to general nonlinear autoregressive processes.
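The prompting idea in the abstract can be illustrated with a toy nonlinear autoregressive process. The sketch below is not the authors' code and does not use Llemma-7B: the sizes, the random weights `W` and `U`, the zero-filled initial context, and the helper name `probe` are all hypothetical choices made for illustration. It assumes that each query consists of a single token and that the recorded measurement is the sequence of output probability vectors produced over the next `m` steps.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM, WINDOW, M = 50, 8, 3, 4  # hypothetical toy sizes

E = rng.normal(size=(VOCAB, DIM))  # hidden token input embedding (the object to recover)
W = rng.normal(size=(WINDOW * DIM, DIM)) / np.sqrt(WINDOW * DIM)
U = rng.normal(size=(DIM, VOCAB))

def f(window):
    """Toy stand-in for the transformer blocks: context window -> next latent vector."""
    return np.tanh(window.reshape(-1) @ W)

def g(x):
    """Toy stand-in for the output embedding: latent vector -> token probabilities."""
    logits = x @ U
    logits = logits - logits.max()  # stabilise the softmax
    p = np.exp(logits)
    return p / p.sum()

def probe(token_id, m=M):
    """Query with one token and record m successive output distributions."""
    window = np.zeros((WINDOW, DIM))  # assumed fixed context (zeros here)
    window[-1] = E[token_id]          # the single-token query
    outputs = []
    for _ in range(m):
        x = f(window)
        outputs.append(g(x))
        window = np.vstack([window[1:], x])  # slide the context window
    return np.concatenate(outputs)

# One measurement vector per token; only these are visible to the prober.
measurements = np.stack([probe(t) for t in range(VOCAB)])
```

In this toy setting, distinct tokens yield distinct rows of `measurements`, which is the kind of injectivity the paper's theorem establishes generically for smooth maps; the sketch only illustrates the data-collection step and proves nothing.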

Paper Structure

This paper contains 7 sections, 6 theorems, 33 equations, 5 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

(proven in the Appendix) Suppose that $X$ is a smooth manifold, $Y$ is a smooth manifold of dimension $\ell$, that $x_1, \dotsc, x_{n-1}$ are elements of $X$, and $Z$ is a submanifold of $X$ of dimension $d$. For smooth functions $f : X^n \to X$ and $g: X \to Y$, the function $\mathcal{A}_m(f,g) : X^{\dots}$ […] then there is a residual subset […] (A residual subset is the intersection of countably many open and dense subsets.)

Figures (5)

  • Figure 1: Flowchart of our Algorithm 1: queries consist of individual tokens entering from the left of the frame, and result in a stream of measurements of tokens from the right. Briefly, $f$ represents the action of the transformer blocks of the LLM, $(\sigma f)$ updates the context window between tokens, and $g$ is the output embedding, which produces probabilities for each of the tokens.
  • Figure 2: Comparison of estimated dimension on a stratified sample using each of the proposed dimension estimators and the dimension estimated directly from the embedding. Note: the local dimension from the known embedding is the "base" dimension, not the fiber dimension; see Figure 3.
  • Figure 3: The log-log volume versus radius plot for the token " }" at the start of a word, obtained from the original embedding (red) and from the proposed dimension estimator with Option (1) (blue).
  • Figure 4: Histograms of the local dimensions estimated for all tokens.
  • Figure 5: Comparison of estimated dimension using the proposed dimension estimator with Option (1) and the dimension estimated directly from the embedding.
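
The captions above refer to local dimension read off a log-log volume versus radius plot. A minimal sketch of that general idea follows. It is a generic neighbor-counting estimator, not the paper's "Option (1)" estimator, whose details are not given here. The function name `local_dimension`, the synthetic point cloud, and the radius range are assumptions chosen for illustration: the number of points within radius $r$ of a base point serves as a volume proxy, and the slope of $\log(\text{count})$ against $\log r$ estimates the dimension.

```python
import numpy as np

def local_dimension(points, base, radii):
    """Slope of log(neighbor count) vs log(radius) around `base`."""
    dists = np.linalg.norm(points - base, axis=1)
    counts = np.array([(dists <= r).sum() for r in radii])
    slope, _intercept = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope

# Synthetic test cloud: a flat 2-dimensional patch isometrically placed in R^5.
rng = np.random.default_rng(0)
uv = rng.uniform(-1.0, 1.0, size=(20000, 2))   # intrinsic coordinates
Q, _ = np.linalg.qr(rng.normal(size=(5, 2)))   # orthonormal columns, shape (5, 2)
cloud = uv @ Q.T                               # points in R^5

# Radii stay well inside the patch so boundary effects do not bend the curve.
radii = np.geomspace(0.1, 0.5, 10)
dim_est = local_dimension(cloud, np.zeros(5), radii)  # should come out near 2
```

On a stratified space the log-log curve need not have a single slope: different radius ranges can expose different dimensions, which is why the Figure 2 caption distinguishes the "base" dimension from the fiber dimension.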

Theorems & Definitions (13)

  • Definition 1
  • Definition 2
  • Theorem 1
  • Proposition 1
  • proof
  • Lemma 1
  • Lemma 2
  • proof
  • Lemma 3
  • proof
  • ...and 3 more