Enrich-on-Graph: Query-Graph Alignment for Complex Reasoning with LLM Enriching

Songze Li, Zhiqiang Liu, Zhengke Gui, Huajun Chen, Wen Zhang

TL;DR

KGQA models still suffer from hallucinations due to a semantic gap between user queries and knowledge graphs. Enrich-on-Graph (EoG) addresses this with a three-stage framework (Parsing, Pruning, Enriching) that leverages LLM priors to generate query-aligned graphs, enabling efficient reasoning. The approach introduces three graph quality metrics (Relevance, Semantic Richness, and Redundancy) and uses mutual information to connect these metrics theoretically to the optimization objective. Empirical results on WebQSP and CWQ demonstrate state-of-the-art performance with lower computational cost and strong plug-and-play adaptability across KGQA baselines.

Abstract

Large Language Models (LLMs) exhibit strong reasoning capabilities in complex tasks. However, they still struggle with hallucinations and factual errors in knowledge-intensive scenarios like knowledge graph question answering (KGQA). We attribute this to the semantic gap between structured knowledge graphs (KGs) and unstructured queries, caused by inherent differences in their focuses and structures. Existing methods usually employ resource-intensive, non-scalable workflows that reason on vanilla KGs, but overlook this gap. To address this challenge, we propose a flexible framework, Enrich-on-Graph (EoG), which leverages LLMs' prior knowledge to enrich KGs and bridge the semantic gap between graphs and queries. EoG enables efficient evidence extraction from KGs for precise and robust reasoning, while ensuring low computational costs, scalability, and adaptability across different methods. Furthermore, we propose three graph quality evaluation metrics to analyze query-graph alignment in the KGQA task, supported by theoretical validation of our optimization objectives. Extensive experiments on two KGQA benchmark datasets indicate that EoG can effectively generate high-quality KGs and achieve state-of-the-art performance. Our code and data are available at https://github.com/zjukg/Enrich-on-Graph.

Paper Structure

This paper contains 48 sections, 2 theorems, 22 equations, 9 figures, 13 tables.

Key Result

Theorem 1

Maximizing the expected posterior probability is equivalent to maximizing the mutual information (MI) between $q$ and $G$.
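One way to see why a statement of this kind can hold is the identity $I(q;G) = H(q) + \mathbb{E}[\log p(q \mid G)]$: if the query distribution is fixed, $H(q)$ is a constant, so raising the expected log posterior raises the MI by the same amount. The sketch below checks this numerically on toy 2x2 joint distributions. It is an illustration under assumptions, not the paper's proof: this page does not give the paper's exact definition of "expected posterior probability", so the log posterior of $q$ given $G$ is assumed here, and the function names and the `aligned` / `vanilla` tables are hypothetical.

```python
import math


def marginals(joint):
    """Return (p_q, p_g) for a table joint[i][j] = p(q=i, G=j)."""
    p_q = [sum(row) for row in joint]
    p_g = [sum(col) for col in zip(*joint)]
    return p_q, p_g


def mutual_information(joint):
    """I(q;G) in nats, computed directly from the joint table."""
    p_q, p_g = marginals(joint)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log(p / (p_q[i] * p_g[j]))
    return mi


def expected_log_posterior(joint):
    """E[log p(q | G)] under the joint (assumed form of the objective)."""
    _, p_g = marginals(joint)
    total = 0.0
    for row in joint:
        for j, p in enumerate(row):
            if p > 0:
                total += p * math.log(p / p_g[j])
    return total


def entropy(dist):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in dist if p > 0)


# Hypothetical toy joints sharing the same query marginal p(q) = (0.5, 0.5).
# "aligned": the graph variable is informative about the query.
# "vanilla": the graph variable is independent of the query.
aligned = [[0.45, 0.05], [0.05, 0.45]]
vanilla = [[0.25, 0.25], [0.25, 0.25]]

for name, joint in [("aligned", aligned), ("vanilla", vanilla)]:
    p_q, _ = marginals(joint)
    print(name,
          "MI =", mutual_information(joint),
          "H(q) + E[log p(q|G)] =", entropy(p_q) + expected_log_posterior(joint))
```

Because both tables share the same $p(q)$, ranking them by expected log posterior gives the same order as ranking them by MI, which is the sense of "equivalent" illustrated here.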

Figures (9)

  • Figure 1: Semantic gap between query and graph: Gray indicates noise, red represents errors, orange denotes reasoning-related information, and green marks the answer. We use $G$, $Q$, $G^*$ to represent the vanilla graph, the query, and the query-aligned graph, respectively, and use their logic forms for illustration. Left: LLMs misextract key information due to the semantic gap between $Q$ and $G$. Right: EoG generates $G^*$ for efficient LLM reasoning.
  • Figure 2: Left: Focus mismatch and structure mismatch between query and vanilla graph cause semantic gaps. Right: Demonstration of EoG's alignment mechanism.
  • Figure 3: Overview of our Enrich-on-Graph framework.
  • Figure 4: Left: Comparison of graph quality metrics between EoG and other methods. Right: Validation of graph quality improvement by the Prune and Enrich modules. Rel., Sem., and Red. indicate Relevance, Semantic Richness, and Redundancy, respectively; blue and orange denote results on CWQ and WebQSP, respectively.
  • Figure 5: Graph visualization of EoG and advanced methods on the CWQ dataset. Left: t-SNE projection of Relevance. Middle: t-SNE visualization of Semantic Richness. Right: Comparison of Redundancy.
  • ...and 4 more figures

Theorems & Definitions (2)

  • Theorem 1
  • Theorem 2