LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions
Xixun Lin, Yucheng Ning, Jingwen Zhang, Yan Dong, Yilong Liu, Yongxuan Wu, Xiaohua Qi, Nan Sun, Yanmin Shang, Kun Wang, Pengfei Cao, Qingyue Wang, Lixin Zou, Xu Chen, Chuan Zhou, Jia Wu, Peng Zhang, Qingsong Wen, Shirui Pan, Bin Wang, Yanan Cao, Kai Chen, Songlin Hu, Li Guo
TL;DR
The paper addresses hallucinations in LLM-based agents by modeling the problem as a multi-component system where errors can propagate across reasoning, execution, perception, memory, and communication. It introduces an internal-state vs external-behavior taxonomy, defining five hallucination types with nine sub-types and eighteen triggering causes, and provides a comprehensive review of ten mitigation/detection approaches. The methodology covers knowledge-based and paradigm-based strategies, including post-hoc verification and multi-agent considerations within a $POMDP$ framework. By detailing theoretical foundations, practical mitigation/detection techniques, and future directions, the work offers a practical roadmap for building safer, more reliable LLM-based agents and provides open resources to accelerate progress.
Abstract
Driven by the rapid advancements of Large Language Models (LLMs), LLM-based agents have emerged as powerful intelligent systems capable of human-like cognition, reasoning, and interaction. These agents are increasingly being deployed across diverse real-world applications, including student education, scientific research, and financial analysis. However, despite their remarkable potential, LLM-based agents remain vulnerable to hallucination issues, which can result in erroneous task execution and undermine the reliability of the overall system design. Addressing this critical challenge requires a deep understanding and a systematic consolidation of recent advances on LLM-based agents. To this end, we present the first comprehensive survey of hallucinations in LLM-based agents. By carefully analyzing the complete workflow of agents, we propose a new taxonomy that identifies different types of agent hallucinations occurring at different stages. Furthermore, we conduct an in-depth examination of eighteen triggering causes underlying the emergence of agent hallucinations. Through a detailed review of a large number of existing studies, we summarize approaches for hallucination mitigation and detection, and highlight promising directions for future research. We hope this survey will inspire further efforts toward addressing hallucinations in LLM-based agents, ultimately contributing to the development of more robust and reliable agent systems.
