ProgRAG: Hallucination-Resistant Progressive Retrieval and Reasoning over Knowledge Graphs

Minbae Park, Hyemin Yang, Jeonghyun Kim, Kunsoo Park, Hyunjoon Kim

TL;DR

ProgRAG tackles hallucination and unreliability in knowledge-graph question answering by introducing a progressive retrieval and reasoning framework. It decomposes complex questions into sub-questions, uses external retrievers to gather evidence, prunes candidates with uncertainty-aware guidance, and reorganizes partial reasoning paths into structured prefixes for final LLM inference. The approach achieves state-of-the-art accuracy on WebQSP, CWQ, and CR-LT without fine-tuning, and demonstrates superior reasoning path coverage and efficiency. These contributions offer more reliable, scalable KGQA with grounded, multi-hop reasoning capabilities.

Abstract

Large Language Models (LLMs) demonstrate strong reasoning capabilities but struggle with hallucinations and limited transparency. Recently, KG-enhanced LLMs that integrate knowledge graphs (KGs) have been shown to improve reasoning performance, particularly for complex, knowledge-intensive tasks. However, these methods still face significant challenges, including inaccurate retrieval and reasoning failures, often exacerbated by long input contexts that obscure relevant information or by context constructions that struggle to capture the richer logical directions required by different question types. Furthermore, many of these approaches rely on LLMs to directly retrieve evidence from KGs, and to self-assess the sufficiency of this evidence, which often results in premature or incorrect reasoning. To address the retrieval and reasoning failures, we propose ProgRAG, a multi-hop knowledge graph question answering (KGQA) framework that decomposes complex questions into sub-questions, and progressively extends partial reasoning paths by answering each sub-question. At each step, external retrievers gather candidate evidence, which is then refined through uncertainty-aware pruning by the LLM. Finally, the context for LLM reasoning is optimized by organizing and rearranging the partial reasoning paths obtained from the sub-question answers. Experiments on three well-known datasets demonstrate that ProgRAG outperforms existing baselines in multi-hop KGQA, offering improved reliability and reasoning quality.
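
The progressive loop described in the abstract (retrieve candidates for each sub-question, prune them, extend the partial paths) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy graph `KG`, the keyword-matching `retrieve` function standing in for an external retriever, the relation hints standing in for LLM-generated sub-questions, and the rule that uses Shannon entropy over normalized retriever scores as the uncertainty signal.

```python
import math

# Hypothetical toy knowledge graph: head -> list of (relation, tail).
KG = {
    "Inception": [("directed_by", "Christopher Nolan"), ("released_in", "2010")],
    "Christopher Nolan": [("born_in", "London"), ("directed", "Inception")],
}

def retrieve(entity, relation_hint):
    """Stand-in for an external retriever: score each outgoing triple."""
    scored = []
    for rel, tail in KG.get(entity, []):
        score = 1.0 if rel == relation_hint else 0.1
        scored.append(((entity, rel, tail), score))
    return scored

def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def prune(scored, threshold=0.5, top_k=2):
    """Assumed pruning rule: keep one candidate when the score
    distribution is confident (low entropy), otherwise keep top_k."""
    total = sum(score for _, score in scored)
    probs = [score / total for _, score in scored]
    ranked = sorted(zip(scored, probs), key=lambda item: item[1], reverse=True)
    keep = 1 if entropy(probs) < threshold else top_k
    return [triple for (triple, _), _ in ranked[:keep]]

def extend_paths(topic_entity, relation_hints):
    """Extend partial reasoning paths one sub-question at a time."""
    paths = [(topic_entity, [])]  # (current entity, triples so far)
    for hint in relation_hints:
        next_paths = []
        for entity, triples in paths:
            scored = retrieve(entity, hint)
            if not scored:
                continue  # dead end: drop this partial path
            for triple in prune(scored):
                next_paths.append((triple[2], triples + [triple]))
        paths = next_paths
    return [triples for _, triples in paths]

if __name__ == "__main__":
    for path in extend_paths("Inception", ["directed_by", "born_in"]):
        print(path)
```

When one candidate dominates the scores, the entropy is low and a single triple survives; when the scores are flat, several candidates are kept so that a later step can disambiguate them. The threshold and `top_k` values are arbitrary placeholders.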

Paper Structure

This paper contains 43 sections, 3 equations, 8 figures, 9 tables.

Figures (8)

  • Figure 1: Comparison between existing KG-enhanced LLMs and the proposed framework. E, I, and C denote External, Internal, and Core Supporting Evidences, respectively, and UQ in (b) indicates uncertainty quantification.
  • Figure 2: Comparison of typical error cases in existing methods versus the proposed framework.
  • Figure 3: ProgRAG operates in three stages: (1) Question decomposition, where the question is split into sub-questions based on a key entity; (2) Sub-question answering, where partial reasoning paths are progressively extended through retrieval and pruning; and (3) Prefix enumeration and repacking, where all prefixes of the reasoning paths are enumerated and reordered. Finally, the LLM infers the answer based on these prefixes.
  • Figure 4: Explored path overlap ratio on CWQ.
  • Figure 5: Uncertainty trends by triple input size on the WebQSP dataset.
  • ...and 3 more figures
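
Stage (3) in the Figure 3 caption, prefix enumeration and repacking, can be sketched as follows. This is a hedged illustration only: the caption does not say how prefixes are reordered, so sorting by prefix length is a placeholder assumption, and the function names and the rendered text format are hypothetical.

```python
def enumerate_prefixes(paths):
    """Return every distinct non-empty prefix of each reasoning path.

    A path is a list of (head, relation, tail) triples.
    """
    seen = set()
    prefixes = []
    for path in paths:
        for end in range(1, len(path) + 1):
            prefix = tuple(path[:end])
            if prefix not in seen:  # shared prefixes are emitted once
                seen.add(prefix)
                prefixes.append(prefix)
    return prefixes

def repack(prefixes):
    """Placeholder ordering (assumption): shorter prefixes first.
    sorted() is stable, so ties keep their discovery order."""
    return sorted(prefixes, key=len)

def render(prefix):
    """Serialize one prefix as a line of text for the LLM context."""
    return " ; ".join(f"({h}, {r}, {t})" for h, r, t in prefix)

if __name__ == "__main__":
    a = ("Inception", "directed_by", "Christopher Nolan")
    b = ("Christopher Nolan", "born_in", "London")
    c = ("Christopher Nolan", "directed", "Inception")
    for prefix in repack(enumerate_prefixes([[a, b], [a, c]])):
        print(render(prefix))
```

Deduplicating shared prefixes keeps the context short when many paths start from the same triple, which is one plausible reason to enumerate prefixes rather than pass full paths.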