'Generalization is hallucination' through the lens of tensor completions

Liang Ze Wong

TL;DR

The paper proposes a tensor-completion framework to unify generalization and hallucination in language generation, formalizing $n$-dimensional data tensors $D^n$ and their completions $D'$, and defining completion artifacts as high-probability, novel $n$-grams outside the observed data. It argues that artifacts arise from partially observed training fibers and rank constraints, and that some artifacts may function as useful generalizations while others constitute hallucinations; the boundary depends on external validation data. The authors provide toy experiments showing that artifacts proliferate when model size is reduced and discuss analogies to recommender systems, highlighting a generalization-hallucination trade-off. They also outline mitigation strategies that involve expanding the training space to include surrounding contexts and adjusting the loss to penalize unsupported high-probability predictions, while noting substantial limitations and the need for scalable methods to quantify artifacts in large models. Overall, the work offers a theoretical lens to connect compression, overfitting, and generation behavior, suggesting concrete avenues for measuring and mitigating artifacts without neglecting beneficial generalization.
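
To make the objects in this summary concrete, the following is a minimal sketch (not the paper's code) of a binary data tensor built from a toy corpus of 3-grams, together with a check for completion artifacts. The helper names `corpus_to_tensor` and `artifacts`, the toy vocabulary, and the hand-written completion `D_completed` are all assumptions introduced for illustration; the 0.95 threshold mirrors the one quoted in the Figure 2 caption.

```python
import numpy as np

def corpus_to_tensor(corpus, vocab):
    """Binary tensor D with D[i, j, k] = 1 iff (vocab[i], vocab[j], vocab[k]) is in the corpus."""
    index = {tok: i for i, tok in enumerate(vocab)}
    n = len(corpus[0])  # all sentences are assumed to have the same length n
    D = np.zeros((len(vocab),) * n, dtype=int)
    for sentence in corpus:
        D[tuple(index[t] for t in sentence)] = 1
    return D

def artifacts(D, D_completed, threshold=0.95):
    """Index tuples scoring at least `threshold` in the completion but absent from D."""
    mask = (D_completed >= threshold) & (D == 0)
    return [tuple(int(i) for i in idx) for idx in np.argwhere(mask)]

# Hypothetical toy data: three tokens, two observed sentences.
vocab = ["a", "b", "c"]
corpus = [("a", "b", "c"), ("a", "c", "b")]
D = corpus_to_tensor(corpus, vocab)

# Hypothetical completion: agrees with D and adds one new sentence (b, a, c).
D_completed = D.astype(float)
D_completed[1, 0, 2] = 1.0

found = artifacts(D, D_completed)
```

Here `found` contains the single index tuple for the unseen sentence `(b, a, c)`. Whether such an entry counts as a useful generalization or a hallucination is not decided by the tensor itself; as the summary notes, that depends on external validation data.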

Abstract

In this short position paper, we introduce tensor completions and artifacts and make the case that they are a useful theoretical framework for understanding certain types of hallucinations and generalizations in language models.

Paper Structure

This paper contains 11 sections, 2 equations, 3 figures.

Figures (3)

  • Figure 1: Illustration of how tensor completion can give rise to new sentences, or artifacts. (a) A tensor $D^3$ associated to a corpus. Each green box is $1$, and represents a sentence in the corpus. Empty space is $0$. The horizontal plane is the $(t_1, t_2)$-plane, while the vertical axis is $t_3$. (b) Fibers of $D^3$ that are 'seen' by a language model during training. (c) A low-rank completion $D'$ that is consistent with the training fibers, but also has additional artifacts (i.e. new sentences, colored orange). The rank of the completion $D'$ is 2, while the original $D^3$ has rank 3.
  • Figure 2: Number of artifacts against number of triples in the training dataset, over 400 random datasets. Artifacts are triples $(t_1, t_2, t_3)$ which are not in the dataset, but the model still predicts $t_3$ with high probability ($\geq 0.95$) when given $(t_1, t_2)$. The same attention-only model with $n_{layers} = 1, n_{head} = 4, d_{model} = 8$ and $d_{head} = 2$ was used throughout.
  • Figure 3: Number of artifacts against number of non-embedding parameters in 165 models with $n_{head} \leq 4, d_{model} \leq 10$ and $d_{head} \leq 6$. The same dataset with 29 triples and 44 tokens was used, so the maximum possible number of artifacts is $44^2 - 29 = 1907$.
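
The artifact count described in the captions of Figures 2 and 3 can be sketched as follows. This is an illustrative reconstruction, not the paper's experimental code: the array layout `probs[t1, t2, t3]` (next-token probabilities for each context), and the function names `count_artifacts` and `max_artifacts`, are assumptions. The bound $44^2 - 29$ is read here as follows: with a threshold above 0.5, each context $(t_1, t_2)$ admits at most one high-probability $t_3$, and the 29 training triples are assumed to occupy 29 distinct contexts that the model fits.

```python
import numpy as np

def count_artifacts(probs, dataset, threshold=0.95):
    """Count triples (t1, t2, t3) not in `dataset` with probs[t1, t2, t3] >= threshold.

    probs: array of shape (V, V, V); probs[t1, t2] is assumed to be a
           probability distribution over the next token t3.
    dataset: iterable of integer triples seen during training.
    """
    in_data = np.zeros(probs.shape, dtype=bool)
    for t1, t2, t3 in dataset:
        in_data[t1, t2, t3] = True
    return int(np.sum((probs >= threshold) & ~in_data))

def max_artifacts(vocab_size, n_triples):
    """Upper bound under the assumptions stated above: one slot per context, minus training triples."""
    return vocab_size ** 2 - n_triples

# Hypothetical toy model over 3 tokens: uniform except for two confident contexts.
probs = np.full((3, 3, 3), 1.0 / 3.0)
probs[0, 1] = [0.0, 0.0, 1.0]     # training triple (0, 1, 2), fitted exactly
probs[1, 0] = [0.02, 0.02, 0.96]  # confident prediction for an unseen triple
dataset = [(0, 1, 2)]

n_artifacts = count_artifacts(probs, dataset)
bound = max_artifacts(44, 29)
```

In this toy setup only the unseen triple `(1, 0, 2)` clears the threshold, so `n_artifacts` is 1, and `bound` reproduces the figure of 1907 quoted in the Figure 3 caption.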