Review of Hallucination Understanding in Large Language and Vision Models
Zhengyi Ho, Siyuan Liang, Dacheng Tao
TL;DR
This survey addresses hallucinations in large language and vision models by proposing a unified, modality-agnostic framework (MOWI) that defines hallucinations across four levels: Model, Observer, World, and Input. It systematically catalogs root causes and mechanisms across the model lifecycle (data, architecture, loss/optimisation, evaluation, and inference) and links them to concrete failure modes in LLMs, LVLMs, and TVMs. Key contributions include a formal general definition of hallucinations, a cross-modal taxonomy of causes (e.g., data salience, memorisation, autoregressive constraints, and reward hacking), and a set of principled directions for mitigation (data curation, mechanistic interpretability, robust evaluation, and red-teaming). The findings highlight that hallucinations are not mere outliers but predictable consequences of training distributions, architectural priors, and evaluation dynamics, with significant implications for deploying reliable generative AI in real-world settings.
Abstract
The widespread adoption of large language and vision models in real-world applications has made it urgent to address hallucinations -- instances where models produce incorrect or nonsensical outputs. These errors can propagate misinformation during deployment, leading to both financial and operational harm. Although much research has been devoted to mitigating hallucinations, our understanding of them is still incomplete and fragmented. Without a coherent understanding of hallucinations, proposed solutions risk mitigating surface symptoms rather than underlying causes, limiting their effectiveness and generalizability in deployment. To address this gap, we first present a unified, multi-level framework for characterizing both image and text hallucinations across diverse applications, aiming to reduce conceptual fragmentation. We then link these hallucinations to specific mechanisms within a model's lifecycle, using a task-modality interleaved approach to promote a more integrated understanding. Our investigations reveal that hallucinations often stem from predictable patterns in data distributions and inherited biases. By deepening our understanding, this survey provides a foundation for developing more robust and effective solutions to hallucinations in real-world generative AI systems.
