SurveyG: A Multi-Agent LLM Framework with Hierarchical Citation Graph for Automated Survey Generation
Minh-Anh Nguyen, Minh-Duc Nguyen, Ha Lan N. T., Kieu Hai Dang, Nguyen Tien Dong, Dung D. Le
TL;DR
SurveyG tackles automated survey generation by embedding papers into a hierarchical citation graph with three layers (Foundation, Development, Frontier) and combining horizontal intra-layer clustering with vertical inter-layer traversal to produce multi-aspect summaries. A two-agent generation framework grounds outline construction and full survey writing in these summaries, followed by a validation loop to ensure coherence, coverage, and factual accuracy. Evaluations on the SurGE benchmark show SurveyG outperforms state-of-the-art baselines across content quality and citation accuracy, demonstrating stronger synthesis, structure, and reliability without relying on human-written inputs. The approach delivers scalable, knowledge-aware surveys aligned with field taxonomy, with practical implications for researchers and evaluators alike.
Abstract
Large language models (LLMs) are increasingly adopted for automating survey paper generation \cite{wang2406autosurvey, liang2025surveyx, yan2025surveyforge, su2025benchmarking, wen2025interactivesurvey}. Existing approaches typically extract content from a large collection of related papers and prompt LLMs to summarize them directly. However, such methods often overlook the structural relationships among papers, resulting in generated surveys that lack a coherent taxonomy and a deeper contextual understanding of research progress. To address these shortcomings, we propose \textbf{SurveyG}, an LLM-based agent framework that integrates a \textit{hierarchical citation graph}, in which nodes denote research papers and edges capture both citation dependencies and semantic relatedness between their contents, thereby embedding structural and contextual knowledge into the survey generation process. The graph is organized into three layers, \textbf{Foundation}, \textbf{Development}, and \textbf{Frontier}, which capture the evolution of research from seminal works to incremental advances and emerging directions. By combining horizontal search within layers and vertical depth traversal across layers, the agent produces multi-level summaries, which are consolidated into a structured survey outline. A multi-agent validation stage then ensures consistency, coverage, and factual accuracy in generating the final survey. Experiments, including evaluations by human experts and LLM-as-a-judge, demonstrate that SurveyG outperforms state-of-the-art frameworks, producing surveys that are more comprehensive and better aligned with the underlying knowledge taxonomy of a field.
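To make the graph structure described in the abstract concrete, the following is a minimal Python sketch of a three-layer citation graph with a horizontal (within-layer) grouping step and a vertical (cross-layer) traversal step. It is an illustration under stated assumptions, not the authors' implementation: the `Paper` class, the year-based `assign_layer` rule and its thresholds, and the helper names are all hypothetical, and the sketch uses only citation edges (the semantic-relatedness edges and the LLM summarization steps are omitted).

```python
from collections import deque
from dataclasses import dataclass, field

LAYERS = ("Foundation", "Development", "Frontier")


@dataclass
class Paper:
    pid: str
    year: int
    cites: list = field(default_factory=list)  # ids of papers this one cites


def assign_layer(paper, foundation_before=2018, frontier_from=2023):
    # Hypothetical rule: split papers into layers by publication year only.
    # The abstract does not say how layers are actually assigned.
    if paper.year < foundation_before:
        return "Foundation"
    if paper.year >= frontier_from:
        return "Frontier"
    return "Development"


def horizontal_groups(papers, layer):
    """Horizontal step: connected components over same-layer citation edges."""
    by_id = {p.pid: p for p in papers}
    members = [p.pid for p in papers if assign_layer(p) == layer]
    adj = {pid: set() for pid in members}
    for pid in members:
        for cited in by_id[pid].cites:
            if cited in adj:  # keep only edges that stay inside the layer
                adj[pid].add(cited)
                adj[cited].add(pid)
    seen, groups = set(), []
    for pid in sorted(members):
        if pid in seen:
            continue
        seen.add(pid)
        queue, component = deque([pid]), []
        while queue:  # breadth-first search within the layer
            cur = queue.popleft()
            component.append(cur)
            for nxt in sorted(adj[cur]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        groups.append(sorted(component))
    return groups


def vertical_trace(papers, start):
    """Vertical step: depth-first walk along citations, starting at `start`."""
    by_id = {p.pid: p for p in papers}
    order, stack, seen = [], [start], {start}
    while stack:
        cur = stack.pop()
        order.append(cur)
        # Push in reverse so that ids are visited in alphabetical order.
        for cited in sorted(by_id[cur].cites, reverse=True):
            if cited in by_id and cited not in seen:
                seen.add(cited)
                stack.append(cited)
    return order


# Toy corpus (invented for illustration only).
papers = [
    Paper("A", 2015),
    Paper("B", 2016),
    Paper("C", 2020, ["A"]),
    Paper("D", 2021, ["C", "B"]),
    Paper("E", 2024, ["D"]),
]

layer_of = {p.pid: assign_layer(p) for p in papers}
development_groups = horizontal_groups(papers, "Development")
lineage = vertical_trace(papers, "E")
```

In this toy corpus, the horizontal step groups the Development papers `C` and `D` together because one cites the other, while the vertical step traces the Frontier paper `E` back through Development to the Foundation papers it ultimately builds on. In a fuller pipeline, each group and each lineage would be handed to an LLM for summarization.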
