An Audit on the Perspectives and Challenges of Hallucinations in NLP
Pranav Narayanan Venkit, Tatiana Chakravorti, Vipul Gupta, Heidi Biggs, Mukund Srinath, Koustava Goswami, Sarah Rajtmajer, Shomir Wilson
TL;DR
This audit exposes a fragmented landscape around hallucination in NLP, showing wide variability in definitions, frameworks, and metrics across 103 papers, complemented by a survey of 171 practitioners. By mapping seven NLP subfields and 31 conceptual frameworks, the study reveals a lack of consensus and minimal engagement with sociotechnical perspectives. It calls explicitly for standardized terminology, transparent methodological documentation, and sociotechnical framing, with recommendations addressed both to authors and to the research community. The work offers practical guidance for reducing misinterpretation and societal risk from generative models, with implications for funding, evaluation, and governance of AI systems.
Abstract
We audit how hallucination in large language models (LLMs) is characterized in peer-reviewed literature through a critical examination of 103 publications across NLP research. This examination reveals a lack of agreement on the term 'hallucination' in the field of NLP. To complement our audit, we survey 171 practitioners from the fields of NLP and AI to capture varying perspectives on hallucination. Our analysis underscores the need for explicit definitions and frameworks outlining hallucination within NLP and highlights potential challenges, while our survey responses provide a thematic understanding of the influence and ramifications of hallucination in society.
