Enhancing Large Language Models with Pseudo- and Multisource- Knowledge Graphs for Open-ended Question Answering

Jiaxiang Liu; Tong Zhou; Yubo Chen; Kang Liu; Jun Zhao

TL;DR

The paper tackles factual hallucinations in large language models (LLMs) for open-ended QA by proposing PG&AKV, a framework that fuses Pseudo-Graph Generation with Atomic Knowledge Verification to leverage multisource knowledge graphs (KGs). It constructs a pseudo-graph $G_p$ via LLM-driven Cypher queries, then applies semantic querying and two-stage pruning to derive a ground-truth graph $G_g$ and a fixed graph $G_f$ for robust answer generation. The approach demonstrates consistent improvements over baselines on diverse datasets (SimpleQuestions, QALD-10, Nature Questions) across GPT-3.5 and GPT-4, with notable gains in ROUGE-L and Hit@1, and shows strong generalization across KG sources. Overall, PG&AKV offers a practical, KG-agnostic path to mitigate hallucinations and improve open-ended QA in real-world applications by integrating pseudo-knowledge generation with principled verification across multiple knowledge graphs.
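As a rough illustration of the first step, the sketch below pulls pseudo-triples out of a small piece of LLM-style Cypher. The Cypher snippet, the two regular expressions, and the function name `extract_pseudo_triples` are assumptions made for this example, not the paper's implementation, and the parser only handles simple `CREATE` patterns with a `name` property.

```python
import re

# Hypothetical LLM output: Cypher CREATE statements describing a pseudo-graph.
PSEUDO_CYPHER = """
CREATE (a:Person {name: "Marie Curie"})
CREATE (b:Award {name: "Nobel Prize in Physics"})
CREATE (a)-[:RECEIVED]->(b)
"""

# Matches node declarations such as (a:Person {name: "Marie Curie"}).
NODE_RE = re.compile(r'\((\w+):\w+\s*\{name:\s*"([^"]+)"\}\)')
# Matches directed edges such as (a)-[:RECEIVED]->(b).
EDGE_RE = re.compile(r'\((\w+)\)-\[:(\w+)\]->\((\w+)\)')


def extract_pseudo_triples(cypher):
    """Return (head, relation, tail) triples found in simple CREATE statements."""
    names = {var: name for var, name in NODE_RE.findall(cypher)}
    triples = []
    for head, rel, tail in EDGE_RE.findall(cypher):
        # Skip edges whose endpoints were never declared with a name.
        if head in names and tail in names:
            triples.append((names[head], rel, names[tail]))
    return triples


print(extract_pseudo_triples(PSEUDO_CYPHER))
```

Triples obtained this way are only candidates: in the framework they still have to be checked against a real KG before they are used for answering.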

Abstract

Mitigating the hallucinations of Large Language Models (LLMs) is a crucial task. Although some existing methods employ self-enhancement techniques, they fall short of effectively addressing unknown factual hallucinations. Meanwhile, Knowledge Graph (KG) enhancement approaches fail to simultaneously address generalization across different KG sources and the enhancement of open-ended question answering. To tackle these limitations, we propose a framework that combines Pseudo-Graph Generation and Atomic Knowledge Verification (PG&AKV). Enhancement of open-ended question answering begins with Pseudo-Graph Generation, which provides the related knowledge framework. Subsequently, Atomic Knowledge Verification uses atomic-level knowledge querying and verification to achieve generalizability across different KG sources. Compared to the baseline, this approach yields a minimum improvement of 11.5 in ROUGE-L score for open-ended questions. For precise-answer questions, we observe a minimum accuracy improvement of 7.5%. Moreover, PG&AKV generalizes across different KG sources: even when using a KG different from the question source, it achieves at least a 3.5% performance improvement. In summary, our results pave the way for enhancing LLMs by incorporating pseudo- and multisource KGs, particularly in the field of open-ended questions.
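The atomic-level querying described above can be pictured with a toy retrieval step. The sketch below is only an assumption-laden illustration: it uses token-overlap (Jaccard) similarity as a stand-in for whatever semantic matching the framework actually uses, and the KG contents, the threshold `min_score`, and the function names are all hypothetical. The LLM-based verification step is not modeled here.

```python
def tokens(triple):
    """Lowercased word set of a triple; underscores in relations become spaces."""
    return set(" ".join(triple).lower().replace("_", " ").split())


def jaccard(a, b):
    """Token-overlap similarity between two triples (stand-in for embeddings)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def semantic_query(pseudo_triples, kg_triples, top_k=1, min_score=0.3):
    """For each pseudo-triple, keep the top_k most similar KG triples above min_score."""
    retrieved = []
    for p in pseudo_triples:
        ranked = sorted(kg_triples, key=lambda t: jaccard(p, t), reverse=True)
        for t in ranked[:top_k]:
            if jaccard(p, t) >= min_score and t not in retrieved:
                retrieved.append(t)
    return retrieved


# Hypothetical KG and a deliberately flawed pseudo-triple.
TOY_KG = [
    ("Marie Curie", "award_received", "Nobel Prize in Physics"),
    ("Marie Curie", "place_of_birth", "Warsaw"),
    ("Albert Einstein", "award_received", "Nobel Prize in Physics"),
]
pseudo = [("Marie Curie", "received", "Nobel Prize in Chemistry 1903")]

print(semantic_query(pseudo, TOY_KG))
```

The point of querying one triple at a time is that the retrieval does not depend on a particular KG schema: any source that can be flattened into triples could be plugged in as `kg_triples`.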

Paper Structure (21 sections, 5 figures, 5 tables)

Introduction
Related Work
Self Enhanced LLMs
KG Enhanced LLMs
Methodology
Pseudo-Graph Generation
Atomic Knowledge Verification
Semantic Query
Pseudo-Graph Verification
Answer Generation
Experiments
Models
Datasets
Baselines
Main Results
...and 6 more sections

Figures (5)

Figure 1: Overview of PG&AKV. Pseudo-Graph Generation: in step 1, we prompt the LLM to generate a pseudo-graph $G_p$ related to the question. Atomic Knowledge Verification: in step 2, the pseudo-triples extracted from $G_p$ are used to query a semantic KG, yielding the ground-truth graph ($G_g$). In step 3, the LLM verifies $G_p$, which leads to the fixed graph ($G_f$).
Figure 2: Pipeline of pseudo-graph generation.
Figure 3: Prompt for pseudo-graph generation. We partially omit the section involving generated code due to the large number of lines it occupies.
Figure 4: Prompt for Pseudo-Graph Verification.
Figure 5: Prompt for answer generation.
