Review of Hallucination Understanding in Large Language and Vision Models

Zhengyi Ho, Siyuan Liang, Dacheng Tao

TL;DR

This survey addresses hallucinations in large language and vision models by proposing a unified, modality-agnostic framework (MOWI) that defines hallucinations across four levels: Model, Observer, World, and Input. It systematically catalogs root causes and mechanisms across the model lifecycle—data, architecture, loss/optimisation, evaluation, and inference—linking them to concrete failure modes in LLMs, LVLMs, and TVMs. Key contributions include a formal general definition of hallucinations, a cross-modal taxonomy of causes (e.g., data salience, memorisation, autoregressive constraints, and reward hacking), and a set of principled directions for mitigation (data curation, mechanistic interpretability, robust evaluation, and red-teaming). The findings highlight that hallucinations are not mere outliers but principled consequences of training distributions, architectural priors, and evaluation dynamics, with significant implications for deploying reliable generative AI in real-world settings.

Abstract

The widespread adoption of large language and vision models in real-world applications has made urgent the need to address hallucinations -- instances where models produce incorrect or nonsensical outputs. These errors can propagate misinformation during deployment, leading to both financial and operational harm. Although much research has been devoted to mitigating hallucinations, our understanding of them remains incomplete and fragmented. Without a coherent understanding of hallucinations, proposed solutions risk mitigating surface symptoms rather than underlying causes, limiting their effectiveness and generalizability in deployment. To address this gap, we first present a unified, multi-level framework for characterizing both image and text hallucinations across diverse applications, aiming to reduce conceptual fragmentation. We then link these hallucinations to specific mechanisms within a model's lifecycle, using a task-modality interleaved approach to promote a more integrated understanding. Our investigations reveal that hallucinations often stem from predictable patterns in data distributions and inherited biases. By deepening our understanding, this survey provides a foundation for developing more robust and effective solutions to hallucinations in real-world generative AI systems.

Paper Structure

This paper contains 37 sections, 1 equation, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Overview of the paper. The first two sections provide an overview of the topic and related works. Section \ref{sec:definitions} defines key terms related to hallucinations and the model types under discussion. Section \ref{sec:root-causes-and-mechanisms} offers an in-depth review of the root causes and mechanisms of hallucinations. The final three sections build on these foundations to distil key insights, evaluate their broader implications, and propose future directions.
  • Figure 2: A timeline of representative works in the past four years exploring the understanding of failure modes in LLMs, LVLMs, and TVMs through a variety of methodological and theoretical lenses.
  • Figure 3: Root causes of hallucinations arising from training data factors. Icons indicate the modalities discussed.
  • Figure 4: Root causes of hallucinations arising from architectural limitations. Icons indicate the modalities discussed.
  • Figure 5: Root causes of hallucinations arising from inference mechanisms. Icons indicate the modalities discussed.
  • ...and 2 more figures