Hallucination is Inevitable for LLMs with the Open World Assumption

Bowen Xu

TL;DR

This work reframes LLM hallucinations as manifestations of the generalization problem under open-world conditions, arguing that true inevitability arises from unbounded space, time, and tasks. It introduces a taxonomy distinguishing Type-I (false memorization) and Type-II (false generalization) hallucinations and analyzes their tractability via the Open World versus Closed World assumptions, invoking the No Free Lunch theorem to explain the limits of universal generalization. The paper argues that detectors and purely corrective mechanisms cannot fully solve hallucinations due to required generalization, and proposes constructive strategies: tolerate hallucinations as a structural feature, maintain adaptivity with limited experience, and render errors intelligible through transparent, human-aligned representations. Collectively, the approach shifts the goal from eradicating all hallucinations to designing AGI systems that operate effectively in open environments, balancing accuracy with interpretability and adaptability. These insights have practical implications for building robust, human-aligned AI that can function under uncertain, unbounded real-world conditions.

Abstract

Large Language Models (LLMs) exhibit impressive linguistic competence but also produce inaccurate or fabricated outputs, often called "hallucinations". Engineering approaches usually regard hallucination as a defect to be minimized, while formal analyses have argued for its theoretical inevitability. Yet both perspectives remain incomplete when considering the conditions required for artificial general intelligence (AGI). This paper reframes "hallucination" as a manifestation of the generalization problem. Under the Closed World assumption, where training and test distributions are consistent, hallucinations may be mitigated. Under the Open World assumption, however, where the environment is unbounded, hallucinations become inevitable. This paper further develops a classification of hallucination, distinguishing cases that may be corrected from those that appear unavoidable under open-world conditions. On this basis, it suggests that "hallucination" should be approached not merely as an engineering defect but as a structural feature to be tolerated and made compatible with human intelligence.

Paper Structure

This paper contains 12 sections, 6 figures.

Figures (6)

  • Figure 1: Illustration of a learned mapping. Black dots represent training samples, while the orange curve denotes the mapping acquired by an LLM. The reduction to two dimensions is for intuitive visualization; the underlying analysis applies equally to high-dimensional spaces.
  • Figure 2: A generated output marked "H" (dashed circle) deviates from the training samples, and may therefore be interpreted as a hallucination.
  • Figure 3: The same output marked "H" in Fig. 2 could also correspond to a true fact, represented here as a purple dot "F". From the learner's perspective during training, it is indeterminate which mapping is superior.
  • Figure 4: Type-I Hallucination (HT-I). The output "H" deviates from a known training sample "F". Since the fact is already contained in the training set, the mapping can be revised until the output aligns with the expected value.
  • Figure 5: Illustration of the inadequacy of abandoning generalization. The output "U" corresponds to an "I don’t know" response.
  • ...and 1 more figure
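
The situation described in the captions of Figures 1 to 4 can be mimicked with a toy sketch. This is not code from the paper: the sample values and the names `TRAIN`, `mapping_a`, `mapping_b`, `faulty_mapping`, `training_error`, and `type_i_errors` are assumptions chosen purely for illustration. It shows two mappings that agree on every training sample yet disagree on an unseen input, so the training data alone cannot say which output is the hallucination. It also shows a mapping that contradicts a known sample, which can be detected and revised.

```python
# Toy illustration only; values and names are hypothetical, not from the paper.

# Hypothetical training samples (the "black dots"): pairs of (input, fact).
TRAIN = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)]


def mapping_a(x):
    """One learned mapping that passes through every training sample."""
    return x ** 2


def mapping_b(x):
    """A second mapping: the extra term is zero at every training input."""
    return x ** 2 + x * (x - 1) * (x - 2)


def faulty_mapping(x):
    """A mapping that contradicts the known training sample at x = 2."""
    return 5.0 if x == 2.0 else x ** 2


def training_error(mapping, samples=TRAIN):
    """Largest absolute deviation of a mapping from the training samples."""
    return max(abs(mapping(x) - y) for x, y in samples)


def type_i_errors(mapping, samples=TRAIN, tol=1e-9):
    """Training inputs where the output deviates from a known fact.

    These correspond to the correctable case: the fact is in the training
    set, so the deviation can be checked and the mapping revised.
    """
    return [x for x, y in samples if abs(mapping(x) - y) > tol]


if __name__ == "__main__":
    # Both mappings are indistinguishable on the training samples.
    print("training error A:", training_error(mapping_a))
    print("training error B:", training_error(mapping_b))

    # On an unseen input they disagree; nothing in TRAIN says which is right.
    unseen = 3.0
    print("A at unseen input:", mapping_a(unseen))
    print("B at unseen input:", mapping_b(unseen))

    # A deviation on a known sample is detectable from the training set.
    print("detectable deviations:", type_i_errors(faulty_mapping))
```

In this sketch, the disagreement at the unseen input stands in for the false-generalization case: either output could turn out to be a fact or a hallucination once the world supplies the answer. The deviation flagged by `type_i_errors` stands in for the false-memorization case, which can be checked against the training set.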