Enhancing Large Language Models with Pseudo- and Multisource-Knowledge Graphs for Open-ended Question Answering
Jiaxiang Liu, Tong Zhou, Yubo Chen, Kang Liu, Jun Zhao
TL;DR
The paper tackles factual hallucinations in large language models (LLMs) for open-ended QA by proposing PG&AKV, a framework that fuses Pseudo-Graph Generation with Atomic Knowledge Verification to leverage multisource knowledge graphs (KGs). It constructs a pseudo-graph $G_p$ via LLM-driven Cypher queries, then applies semantic querying and two-stage pruning to derive a ground-truth graph $G_g$ and a fixed graph $G_f$ for robust answer generation. The approach demonstrates consistent improvements over baselines on diverse datasets (SimpleQuestions, QALD-10, Nature Questions) across GPT-3.5 and GPT-4, with notable gains in ROUGE-L and Hit@1, and shows strong generalization across KG sources. Overall, PG&AKV offers a practical, KG-agnostic path to mitigate hallucinations and improve open-ended QA in real-world applications by integrating pseudo-knowledge generation with principled verification across multiple knowledge graphs.
Abstract
Mitigating the hallucinations of Large Language Models (LLMs) is a crucial task. Although some existing methods employ self-enhancement techniques, they fall short of effectively addressing unknown factual hallucinations. Meanwhile, Knowledge Graph (KG) enhancement approaches fail to simultaneously address generalization across different KG sources and the enhancement of open-ended question answering. To tackle these limitations, we propose a framework that combines Pseudo-Graph Generation and Atomic Knowledge Verification (PG&AKV). Enhancement of open-ended question answering begins with Pseudo-Graph Generation, which provides the related knowledge framework. Subsequently, Atomic Knowledge Verification uses atomic-level knowledge querying and verification to achieve generalizability across different KG sources. Compared to the baseline, this approach yields a minimum improvement of 11.5 in ROUGE-L score for open-ended questions. For precise-answer questions, we observe a minimum accuracy improvement of 7.5%. Moreover, PG&AKV generalizes across different KG sources: even when using a KG different from the question source, PG&AKV achieves at least a 3.5% performance improvement. In summary, our results pave the way for enhancing LLMs by incorporating pseudo- and multisource KGs, particularly in the field of open-ended questions.
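To make the two stages concrete, the following is a minimal sketch, not the authors' implementation. It assumes the pseudo-graph arrives as simplified Cypher-like edge strings and that KG facts are plain (head, relation, tail) triples. The names `parse_pseudo_graph`, `similarity`, and `verify`, the edge pattern, and the threshold are all hypothetical, and character-level string similarity stands in for whatever semantic retrieval and LLM-based verification the framework actually uses.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, simplified edge syntax: ("head")-[:REL_NAME]->("tail").
# Real Cypher node patterns are richer; this is only for illustration.
EDGE = re.compile(r'\("([^"]+)"\)-\[:([A-Za-z_]+)\]->\("([^"]+)"\)')


def parse_pseudo_graph(cypher_text):
    """Extract atomic (head, relation, tail) triples from Cypher-like text."""
    return [
        (head, rel.lower().replace("_", " "), tail)
        for head, rel, tail in EDGE.findall(cypher_text)
    ]


def similarity(a, b):
    """Stand-in for semantic similarity: character-level ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def verify(pseudo_triples, kg_triples, threshold=0.6):
    """Replace each pseudo triple with its closest KG triple.

    Pseudo triples with no sufficiently similar KG triple are dropped,
    which is an assumption of this sketch, not a claim about the paper.
    """
    fixed = []
    for triple in pseudo_triples:
        text = " ".join(triple)
        best = max(kg_triples, key=lambda k: similarity(text, " ".join(k)))
        if similarity(text, " ".join(best)) >= threshold:
            fixed.append(best)
    return fixed


# An LLM-drafted pseudo-graph containing a hallucinated tail entity.
pseudo = 'CREATE ("Marie Curie")-[:BORN_IN]->("Paris")'
kg = [
    ("Marie Curie", "born in", "Warsaw"),
    ("Albert Einstein", "born in", "Ulm"),
]
fixed_graph = verify(parse_pseudo_graph(pseudo), kg)
```

Because verification operates on individual triples rather than on a KG-specific schema or query language, the same loop can in principle be pointed at triples drawn from a different KG, which is the intuition behind the cross-source generalization claimed above.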
