Beyond Static Retrieval: Opportunities and Pitfalls of Iterative Retrieval in GraphRAG

Kai Guo, Xinnan Dai, Shenglai Zeng, Harry Shomer, Haoyu Han, Yu Wang, Jiliang Tang

TL;DR

This paper investigates iterative retrieval within GraphRAG to address the brittleness of multi-hop QA when bridge evidence is missing. Through a systematic study across multiple GraphRAG backbones and iterative strategies, it finds that iteration helps surface bridge documents and improves performance on complex questions, but can also introduce noise and still leave some key evidence buried too deep in the ranking. To tackle this bottleneck, it introduces Bridge-Guided Dual-Thought-based Retrieval (BDTR), which combines Dual-Thought-based Retrieval to diversify evidence and Bridge-Guided Evidence Calibration to promote bridge-bearing documents into top ranks. BDTR delivers consistent improvements across diverse backbones and datasets, offering practical guidance for designing more robust GraphRAG systems.

Abstract

Retrieval-augmented generation (RAG) is a powerful paradigm for improving large language models (LLMs) on knowledge-intensive question answering. Graph-based RAG (GraphRAG) leverages entity-relation graphs to support multi-hop reasoning, but most systems still rely on static retrieval. When crucial evidence, especially bridge documents that connect disjoint entities, is absent, reasoning collapses and hallucinations persist. Iterative retrieval, which performs multiple rounds of evidence selection, has emerged as a promising alternative, yet its role within GraphRAG remains poorly understood. We present the first systematic study of iterative retrieval in GraphRAG, analyzing how different strategies interact with graph-based backbones and under what conditions they succeed or fail. Our findings reveal clear opportunities: iteration improves performance on complex multi-hop questions, helps promote bridge documents into leading ranks, and different strategies offer complementary strengths. At the same time, pitfalls remain: naive expansion often introduces noise that reduces precision, gains are limited on single-hop or simple comparison questions, and some bridge evidence remains buried too deep to be used effectively. Together, these results highlight a central bottleneck, namely that GraphRAG's effectiveness depends not only on recall but also on whether bridge evidence is consistently promoted into leading positions where it can support reasoning chains. To address this challenge, we propose Bridge-Guided Dual-Thought-based Retrieval (BDTR), a simple yet effective framework that generates complementary thoughts and leverages reasoning chains to recalibrate rankings and bring bridge evidence into leading positions. BDTR achieves consistent improvements across diverse GraphRAG settings and provides guidance for the design of future GraphRAG systems.
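
To make the idea of retrieving with two complementary thoughts and then recalibrating the ranking toward bridge evidence more concrete, the following is a minimal toy sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the two thoughts and the bridge cue are written by hand rather than generated by an LLM, the retriever is plain token overlap rather than a GraphRAG backbone, the two rankings are merged with reciprocal rank fusion (a swapped-in merge rule), and calibration is reduced to moving cue-bearing documents ahead of the rest. The function names `retrieve`, `fuse`, and `calibrate` are hypothetical.

```python
from collections import defaultdict


def tokens(text):
    """Lowercase whitespace tokenization; a stand-in for a real retriever's encoder."""
    return set(text.lower().split())


def retrieve(query, corpus, k):
    """Rank documents by token overlap with the query (toy scoring, an assumption)."""
    q = tokens(query)
    ranked = sorted(corpus, key=lambda d: (-len(q & tokens(corpus[d])), d))
    return ranked[:k]


def fuse(rankings, offset=60):
    """Merge rankings with reciprocal rank fusion (assumed merge rule, not from the paper)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (offset + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


def calibrate(ranking, corpus, bridge_cues):
    """Move documents that mention a bridge cue ahead of the rest, keeping relative order."""
    cues = {c.lower() for c in bridge_cues}

    def has_cue(doc_id):
        return bool(cues & tokens(corpus[doc_id]))

    promoted = [d for d in ranking if has_cue(d)]
    remaining = [d for d in ranking if not has_cue(d)]
    return promoted + remaining


# Hypothetical four-document corpus for the question
# "Where was the director of Inception born?"
corpus = {
    "d1": "inception is a film directed by nolan",
    "d2": "nolan was born in london",
    "d3": "inception film box office revenue and film awards",
    "d4": "london is the capital of england",
}

# Two hand-written thoughts standing in for LLM-generated ones.
thought_a = "who directed the film inception"
thought_b = "where was the director born"

fused = fuse([retrieve(thought_a, corpus, 3), retrieve(thought_b, corpus, 3)])
# "nolan" plays the role of a bridge cue taken from a reasoning chain.
calibrated = calibrate(fused, corpus, ["nolan"])

print(fused)       # ['d1', 'd4', 'd2', 'd3']
print(calibrated)  # ['d1', 'd2', 'd4', 'd3']
```

In this toy run the bridge document `d2`, which links the director to a birthplace, sits third after fusion and moves to second after calibration. That mirrors, in a very reduced form, the point that recall alone is not enough if bridge evidence is ranked too low to be used.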

Paper Structure

This paper contains 28 sections, 6 equations, 13 figures, 5 tables, 1 algorithm.

Figures (13)

  • Figure 1: EM Comparison on Multi-hop QA Datasets.
  • Figure 2: EM Comparison by Question Type.
  • Figure 3: Complementary.
  • Figure 4: EM Comparison on Multi-hop QA Datasets.
  • Figure 5: Illustration of our framework BDTR, shown here with two iterations as an example. In each reasoning step, the model generates two thoughts to drive retrieval and constructs a reasoning chain that encodes intermediate bridge cues. The documents retrieved for the two thoughts provide diverse and complementary evidence, while the bridge-guided calibration module adjusts their ranking to ensure that critical bridge facts appear in leading positions for reasoning.
  • ...and 8 more figures