A Survey of Large Language Model Agents for Question Answering
Murong Yue
TL;DR
This survey reviews the rise of LLM-based agents for question answering, contrasting them with traditional QA pipelines and naive LLM QA. It formalizes the agent architecture in terms of memory, planning, and inner thinking, and maps QA tasks to five stages: planning, question understanding, information retrieval, answer generation, and follow-up. The survey provides a taxonomy-centered review of datasets and methods (prompting- and tuning-based planning, slotting, query expansion and reformulation, IR strategies, tool-augmented generation, and multi-turn interaction), and highlights open challenges such as benchmarking, hallucination, calibration, and autonomous tool use. It also discusses future directions, including integrating LLMs into indexing, improving reasoning via memory and causal methods, and designing more capable, self-optimizing agents. Overall, the paper lays out a structured roadmap for advancing LLM-driven QA agents and their practical deployment.
Abstract
This paper surveys the development of large language model (LLM)-based agents for question answering (QA). Traditional agents face significant limitations, including substantial data requirements and difficulty in generalizing to new environments. LLM-based agents address these challenges by leveraging LLMs as their core reasoning engine. Because they can interact with external environments, these agents achieve superior QA results compared to traditional QA pipelines and naive LLM QA systems. We systematically review the design of LLM agents in the context of QA tasks, organizing our discussion across key stages: planning, question understanding, information retrieval, and answer generation. Additionally, this paper identifies ongoing challenges and explores future research directions to enhance the performance of LLM agent QA systems.
