Paths-over-Graph: Knowledge Graph Empowered Large Language Model Reasoning

Xingyu Tan, Xiaoyang Wang, Qing Liu, Xiwei Xu, Xin Yuan, Wenjie Zhang

TL;DR

Paths-over-Graph (PoG) is a KG-enhanced reasoning framework that augments large language models with knowledge graph reasoning paths to improve faithfulness and interpretability in KGQA. It employs a dynamic three-phase exploration (topic-path, supplement-path, and node-expand) guided by a predicted depth, coupled with three-step beam-search pruning that leverages graph structure, LLM prompting, and SBERT-based similarity. The approach yields state-of-the-art results across five KGQA benchmarks, outperforming strong prompting-based and fine-tuned baselines, and demonstrates robustness on multi-hop and multi-entity questions while reducing LLM calls and token usage. The combination of graph-aware pruning, path summarization, and evidence-grounded reasoning provides interpretable chains of thought and practical gains in accuracy and efficiency for knowledge-intensive reasoning tasks.

Abstract

Large Language Models (LLMs) have achieved impressive results in various tasks but struggle with hallucination and a lack of relevant knowledge, especially in deep, complex reasoning and knowledge-intensive tasks. Knowledge Graphs (KGs), which capture vast amounts of facts in a structured format, offer a reliable source of knowledge for reasoning. However, existing KG-based LLM reasoning methods face challenges such as handling multi-hop reasoning, handling multi-entity questions, and effectively utilizing graph structures. To address these issues, we propose Paths-over-Graph (PoG), a novel method that enhances LLM reasoning by integrating knowledge reasoning paths from KGs, improving the interpretability and faithfulness of LLM outputs. PoG tackles multi-hop and multi-entity questions through a three-phase dynamic multi-hop path exploration, which combines the inherent knowledge of LLMs with factual knowledge from KGs. To improve efficiency, PoG first prunes irrelevant information from the graph exploration and then applies a three-step pruning technique that incorporates graph structures, LLM prompting, and a pre-trained language model (e.g., SBERT) to narrow down the explored candidate paths. This ensures that all reasoning paths contain highly relevant information captured from KGs, making the reasoning faithful and interpretable in problem-solving. PoG uses graph structure to prune irrelevant noise and represents the first method to implement multi-entity deep path detection on KGs for LLM reasoning tasks. Comprehensive experiments on five benchmark KGQA datasets demonstrate that PoG outperforms the state-of-the-art method ToG across GPT-3.5-Turbo and GPT-4, achieving an average accuracy improvement of 18.9%. Notably, PoG with GPT-3.5-Turbo surpasses ToG with GPT-4 by up to 23.9%.
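
The three-step pruning described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The step order, the structural rule (keep only paths that touch every topic entity), the bag-of-words cosine used here in place of SBERT embeddings, and every name in the block (`structural_filter`, `similarity_filter`, `llm_select`, `prune`, `ask_llm`, `keep`, `beam_width`) are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# A path alternates entities and relations: (entity, relation, entity, ...).
Path = Tuple[str, ...]


def bow_cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (assumed stand-in for SBERT similarity)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def structural_filter(paths: Sequence[Path], topic_entities: Sequence[str]) -> List[Path]:
    """Graph-structure step (assumed rule): keep paths containing every topic entity."""
    required = set(topic_entities)
    return [p for p in paths if required.issubset(p)]


def similarity_filter(paths: Sequence[Path], question: str, keep: int) -> List[Path]:
    """Similarity step: rank paths against the question and keep the top `keep`."""
    ranked = sorted(paths, key=lambda p: bow_cosine(question, " ".join(p)), reverse=True)
    return ranked[:keep]


def llm_select(
    paths: Sequence[Path],
    question: str,
    beam_width: int,
    ask_llm: Callable[[str, Sequence[Path]], List[int]],
) -> List[Path]:
    """LLM step: `ask_llm` is a hypothetical callable returning indices of chosen paths."""
    chosen = ask_llm(question, paths)
    return [paths[i] for i in chosen if 0 <= i < len(paths)][:beam_width]


def prune(
    paths: Sequence[Path],
    question: str,
    topic_entities: Sequence[str],
    ask_llm: Callable[[str, Sequence[Path]], List[int]],
    keep: int = 20,
    beam_width: int = 3,
) -> List[Path]:
    """Apply the cheap filters first so the LLM only sees a short candidate list."""
    candidates = structural_filter(paths, topic_entities)
    candidates = similarity_filter(candidates, question, keep)
    return llm_select(candidates, question, beam_width, ask_llm)
```

Running the inexpensive structural and similarity filters before the LLM call is one plausible way to reduce LLM calls and token usage, which the summary reports as a benefit of PoG. The ordering above is a design choice of this sketch.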

Paper Structure

This paper contains 29 sections, 4 equations, 13 figures, 10 tables, 2 algorithms.

Figures (13)

  • Figure 1: Representative workflow of four LLM reasoning paradigms.
  • Figure 2: Overview of the PoG architecture. Exploration: After initialization (detailed in Figure 3), the model retrieves entity paths from $\mathcal{G}_q$ through three exploration phases. Path Pruning: PoG applies a three-step beam search to prune paths after each exploration phase. Question Answering: The pruned paths are then evaluated for question answering. If these paths do not fully answer the question, the model explores deeper paths until $D_{max}$ is reached or moves on to the next exploration phase.
  • Figure 3: Overview of the initialization phase. Output 1: from the input question, the model identifies topic entities and prompts the LLM to decompose questions into split questions $q_{split}$ and generate an indicator $I_{LLM}$. The indicator outlines a strategy for formulating the answer and predicts the exploration depth $D_{predict}$. Output 2: the model queries the source KG up to $D_{max}$-hop from identified topic entities, constructing and pruning the evidence subgraph $\mathcal{G}_q$.
  • Figure 4: The accuracy of PoG and PoG-E on the CWQ and WebQSP datasets for varying $D_{\max}$.
  • Figure 5: The lengths of the ground-truth SPARQL queries within the CWQ and WebQSP datasets.
  • ...and 8 more figures
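
The control flow in the Figure 2 caption (explore, prune, attempt an answer, then go deeper or switch phase) can be sketched as a small loop. This is a rough sketch under stated assumptions, not the paper's algorithm: starting each phase at the predicted depth, reusing the same depth schedule in every phase, and the callables `explore`, `prune`, and `try_answer` are all hypothetical placeholders.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Path = Tuple[str, ...]

# Phase names taken from the summary; their internals are not modeled here.
PHASES = ("topic-path", "supplement-path", "node-expand")


def answer_question(
    question: str,
    d_predict: int,
    d_max: int,
    explore: Callable[[str, int], List[Path]],
    prune: Callable[[Sequence[Path]], List[Path]],
    try_answer: Callable[[str, Sequence[Path]], Optional[str]],
) -> Tuple[Optional[str], Optional[str], Optional[int]]:
    """Return (answer, phase, depth), or (None, None, None) if nothing suffices."""
    for phase in PHASES:
        # Assumption: each phase starts at the predicted depth, capped at d_max.
        depth = min(d_predict, d_max)
        while depth <= d_max:
            paths = prune(explore(phase, depth))   # prune after each exploration
            answer = try_answer(question, paths)   # None means "not fully answered"
            if answer is not None:
                return answer, phase, depth
            depth += 1                             # explore deeper paths
        # d_max reached without an answer: move on to the next phase
    return None, None, None
```

In a fuller version, `prune` would wrap the three-step beam search and `try_answer` would prompt the LLM with the pruned paths. Both are left as injected callables so that the loop structure stays visible.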

Theorems & Definitions (3)

  • Definition 1: Reasoning Path
  • Example 1
  • Definition 2: Entity Path