PolyG: Adaptive Graph Traversal for Diverse GraphRAG Questions
Renjie Liu, Haitian Jiang, Xiao Yan, Bo Tang, Jinyang Li
TL;DR
This work argues that GraphRAG evaluation and design have overfitted to KGQA-style questions, and proposes a four-pattern taxonomy over knowledge-graph triples $\langle s, p, o\rangle$ to cover diverse real-world queries. It introduces PolyBench, a benchmark built from GRBench graphs that uses multi-pattern templates and paraphrasing for linguistic and structural diversity, and that includes nested questions which decompose into basic patterns. The authors then present PolyG, an adaptive GraphRAG system that categorizes each question, constructs a query plan, and adaptively prompts an LLM to generate Cypher queries for context retrieval, followed by self-correction and structured context formation. In experiments across multiple domains and LLMs, PolyG achieves higher generation-quality win rates, lower latency, and reduced token usage than state-of-the-art baselines, underscoring the value of query planning and adaptive traversal in GraphRAG. The code and benchmark data are open-source, enabling broader evaluation and development of GraphRAG methods for diverse question patterns and real-world applications.
Abstract
GraphRAG enhances large language models (LLMs) by retrieving related facts from external knowledge graphs so that they generate higher-quality answers to user questions. However, current GraphRAG methods are primarily evaluated on, and overly tailored to, knowledge graph question answering (KGQA) benchmarks, which are biased towards a few specific question patterns and do not reflect the diversity of real-world questions. To better evaluate GraphRAG methods, we propose a complete four-class taxonomy that categorizes the basic patterns of knowledge graph questions and use it to create PolyBench, a new GraphRAG benchmark encompassing a comprehensive set of graph questions. With the new benchmark, we find that existing GraphRAG methods fall short in effectiveness (i.e., quality of the generated answers) and/or efficiency (i.e., response time or token usage) because they adopt either a fixed graph traversal strategy or free-form exploration by LLMs for fact retrieval, whereas different question patterns require distinct graph traversal strategies and context formation. To facilitate better retrieval, we propose PolyG, an adaptive GraphRAG approach that decomposes and categorizes questions according to our proposed taxonomy. Built on top of a unified interface and execution engine, PolyG dynamically prompts an LLM to generate a graph database query that retrieves the context for each decomposed basic question. Compared with SOTA GraphRAG methods, PolyG achieves a higher win rate in generation quality with low response latency and token cost. Our code and benchmark are open-source at https://github.com/Liu-rj/PolyG.
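To make the control flow concrete, the following is a minimal sketch of a decompose, categorize, query, and retry loop of the kind the abstract describes. It is not the PolyG implementation: the names `PATTERNS`, `Step`, `plan`, `retrieve`, and `build_context` are hypothetical, the four pattern labels are placeholders because the taxonomy classes are not named here, and the LLM and the graph database are passed in as plain callables rather than real clients.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple

# Placeholder labels (assumption): the text states there are four basic
# question patterns but does not name them in this section.
PATTERNS = ("pattern_1", "pattern_2", "pattern_3", "pattern_4")


@dataclass
class Step:
    question: str  # one decomposed basic question
    pattern: str   # the taxonomy class assigned to it


def plan(question: str,
         decompose: Callable[[str], List[str]],
         classify: Callable[[str], str]) -> List[Step]:
    """Decompose a question and label every basic sub-question."""
    steps = []
    for sub_question in decompose(question):
        label = classify(sub_question)
        if label not in PATTERNS:
            raise ValueError(f"unknown pattern: {label!r}")
        steps.append(Step(sub_question, label))
    return steps


def retrieve(step: Step,
             generate_query: Callable[[Step, Optional[str]], str],
             run_query: Callable[[str], List[Any]],
             max_retries: int = 2) -> List[Any]:
    """Generate a graph query for one step and retry when execution fails.

    The previous error message is fed back to the generator, which is one
    simple way to realize self-correction (an assumption of this sketch).
    """
    error: Optional[str] = None
    for _ in range(max_retries + 1):
        query = generate_query(step, error)
        try:
            return run_query(query)
        except Exception as exc:  # any execution failure triggers a retry
            error = str(exc)
    return []  # give up with empty context after the last retry


def build_context(question: str,
                  decompose: Callable[[str], List[str]],
                  classify: Callable[[str], str],
                  generate_query: Callable[[Step, Optional[str]], str],
                  run_query: Callable[[str], List[Any]]
                  ) -> List[Tuple[Step, List[Any]]]:
    """Run the plan and pair each step with the rows retrieved for it."""
    return [(step, retrieve(step, generate_query, run_query))
            for step in plan(question, decompose, classify)]
```

In this sketch the pattern label travels with each `Step`, so a query generator can choose a pattern-specific prompt. Keeping the LLM and the database behind callables also lets the loop be exercised with stubs before any real service is connected.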
