Multi-perspective Improvement of Knowledge Graph Completion with Large Language Models
Derong Xu, Ziheng Zhang, Zhenxi Lin, Xian Wu, Zhihong Zhu, Tong Xu, Xiangyu Zhao, Yefeng Zheng, Enhong Chen
TL;DR
Description-based knowledge graph completion is often limited by incomplete graph structure and sparse textual descriptions. The paper introduces MPIKGC, a general framework that augments description-based KGC by querying large language models from three perspectives: Description Expansion, Relation Understanding, and Structure Extraction. Using chain-of-thought prompts, global/local/reverse relation prompts, and keyword-based structural augmentation (with SameAs edges), MPIKGC enhances entity descriptions, clarifies relation semantics, and enriches graph structure, improving link prediction and triplet classification on multiple datasets and with several base models. Ablations and cross-LLM analyses indicate that the framework generalizes across settings, while the authors acknowledge challenges such as hallucination that warrant controlled prompting or fine-tuning in future work.
Abstract
Knowledge graph completion (KGC) is a widely used method to tackle incompleteness in knowledge graphs (KGs) by predicting missing links. Description-based KGC leverages pre-trained language models to learn entity and relation representations from their names or descriptions, and has shown promising results. However, its performance is still limited by text quality and incomplete structure: entity descriptions are often insufficient and relations are represented solely by their names, leading to sub-optimal results. To address this issue, we propose MPIKGC, a general framework that compensates for the deficiency of contextualized knowledge and improves KGC by querying large language models (LLMs) from various perspectives. It leverages the reasoning, explanation, and summarization capabilities of LLMs to expand entity descriptions, understand relations, and extract structures, respectively. We extensively evaluate the effectiveness of our framework on four description-based KGC models and four datasets, for both link prediction and triplet classification tasks.
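
As an illustration of the structure-extraction perspective (keywords summarized by an LLM, then used to add SameAs edges between related entities), here is a minimal sketch. It is not the authors' implementation: the function name `add_same_as_edges`, the `min_shared` overlap threshold, and the toy keyword lists are assumptions made for this example, and the matching rule used in the paper may differ.

```python
from itertools import combinations


def add_same_as_edges(keywords, min_shared=2):
    """Return (head, "SameAs", tail) triples for entity pairs whose keyword
    lists share at least `min_shared` keywords.

    `keywords` maps an entity name to its keyword list. In the setting
    described above these keywords would come from an LLM; here they are
    supplied by hand. The overlap threshold is a hypothetical matching rule.
    """
    triples = []
    # Iterate over unordered entity pairs in a deterministic order.
    for a, b in combinations(sorted(keywords), 2):
        shared = set(keywords[a]) & set(keywords[b])
        if len(shared) >= min_shared:
            # Add the edge in both directions, since similarity is symmetric.
            triples.append((a, "SameAs", b))
            triples.append((b, "SameAs", a))
    return triples


# Toy keyword lists, invented for illustration only.
keywords = {
    "Inception": ["film", "science fiction", "nolan"],
    "Interstellar": ["film", "science fiction", "space"],
    "Paris": ["city", "france", "capital"],
}

extra_triples = add_same_as_edges(keywords)
for triple in extra_triples:
    print(triple)
```

The resulting triples would simply be appended to the training set of a description-based KGC model, so the base model itself needs no modification under this sketch.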
