Knowledge Graph-Enhanced Large Language Models via Path Selection
Haochen Liu, Song Wang, Yaochen Zhu, Yushun Dong, Jundong Li
TL;DR
This work addresses factual inaccuracies in LLM outputs by augmenting prompts with knowledge paths extracted from an external knowledge graph (KG), denoted $G=(\mathcal{E},\mathcal{R},\mathcal{T})$. It introduces KELP, a three-stage framework consisting of knowledge path extraction, sample encoding, and fine-grained path selection. A latent semantic path-text encoder and two coverage rules let the framework capture both direct and indirect semantic relationships between paths and the input text, and an optional Relation-Only Ranking step scales the approach to very large KGs. The path-text encoder is trained with a pairwise objective so that its scores align with potentially impactful knowledge. Experiments on MetaQA and FACTKG show KELP surpassing LLM-based evidence baselines and approaching fully supervised baselines in many settings. The approach improves factual accuracy and remains robust in few-shot regimes, making it a scalable and flexible tool for KG-enhanced LLMs in question answering and fact verification.
Abstract
Large Language Models (LLMs) have shown unprecedented performance in various real-world applications. However, they are known to generate factually inaccurate outputs, a.k.a. the hallucination problem. In recent years, incorporating external knowledge extracted from Knowledge Graphs (KGs) has become a promising strategy to improve the factual accuracy of LLM-generated outputs. Nevertheless, most existing approaches rely on the LLMs themselves to perform KG knowledge extraction, which is highly inflexible because an LLM can only provide a binary judgment on whether a given piece of knowledge (e.g., a knowledge path in the KG) should be used. In addition, LLMs tend to pick only knowledge with a direct semantic relationship to the input text, while potentially useful knowledge with indirect semantics can be ignored. In this work, we propose KELP, a principled three-stage framework that addresses these problems. Specifically, KELP achieves finer-grained, more flexible knowledge extraction by generating scores for knowledge paths with respect to the input text via latent semantic matching. Meanwhile, knowledge paths with indirect semantic relationships to the input text can also be considered through a trained encoding between the selected KG paths and the input text. Experiments on real-world datasets validate the effectiveness of KELP.
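The idea of scoring knowledge paths against the input text, rather than asking for a yes/no judgment, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy triples, the helper names (`extract_paths`, `path_to_text`, `embed`, `cosine`, `select_paths`), the 2-hop limit, and the top-k cutoff are hypothetical. In particular, the bag-of-words `embed` function is only a stand-in for a trained path-text encoder; it measures lexical overlap and would not capture the indirect semantics that a learned encoder is meant to handle.

```python
import math
from collections import Counter

# Toy KG as (head, relation, tail) triples. Illustrative data only.
TRIPLES = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Inception", "starred_actors", "Leonardo DiCaprio"),
    ("Christopher Nolan", "directed", "Interstellar"),
    ("Inception", "release_year", "2010"),
]


def extract_paths(entity, triples, max_hops=2):
    """Enumerate all paths of 1..max_hops triples that start at `entity`."""
    paths = []
    frontier = [(entity, [])]
    for _ in range(max_hops):
        next_frontier = []
        for node, path in frontier:
            for h, r, t in triples:
                if h == node:
                    new_path = path + [(h, r, t)]
                    paths.append(new_path)
                    next_frontier.append((t, new_path))
        frontier = next_frontier
    return paths


def path_to_text(path):
    """Flatten a path into a plain sentence-like string."""
    return " ".join(f"{h} {r.replace('_', ' ')} {t}" for h, r, t in path)


def embed(text):
    """Stand-in encoder (assumption): lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_paths(question, entity, triples, top_k=2):
    """Score every candidate path against the question and keep the top_k."""
    q = embed(question)
    scored = [
        (cosine(q, embed(path_to_text(p))), p)
        for p in extract_paths(entity, triples)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    for score, path in select_paths("who directed Inception", "Inception", TRIPLES):
        print(f"{score:.3f}  {path_to_text(path)}")
```

Because each path receives a continuous score, the selection step can rank candidates and apply a cutoff or additional rules, which is the kind of flexibility a binary keep/drop decision does not offer. Replacing `embed` with a learned encoder would leave the rest of the pipeline unchanged.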
