KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion

Yanbin Wei, Qiushi Huang, James T. Kwok, Yu Zhang

TL;DR

KICGPT tackles knowledge graph completion by combining a traditional triple-based retriever with a large language model (LLM) through an in-context learning strategy called Knowledge Prompt, which enables in-context reasoning without finetuning the LLM. The method draws demonstrations from two pools (analogy and supplement) and orders them carefully in the prompt, guiding the LLM to re-rank the top subset of candidate entities produced by the retriever. Empirical results on FB15k-237 and WN18RR show state-of-the-art or competitive performance across metrics with lower training overhead, and ablations confirm the value of demonstration design and prompt engineering. The approach notably improves handling of long-tail entities by pairing the LLM's broad knowledge with structural cues from the knowledge graph, offering a practical path to KGC that requires no LLM finetuning.

Abstract

Knowledge Graph Completion (KGC) is crucial for addressing knowledge graph incompleteness and supporting downstream applications. Many models have been proposed for KGC, and they fall into two main classes: triple-based and text-based approaches. Triple-based methods struggle with long-tail entities due to limited structural information and imbalanced entity distributions. Text-based methods alleviate this issue but require costly training of language models and specific finetuning for knowledge graphs, which limits their efficiency. To address these limitations, we propose KICGPT, a framework that integrates a large language model (LLM) and a triple-based KGC retriever. It mitigates the long-tail problem without incurring additional training overhead. KICGPT uses an in-context learning strategy called Knowledge Prompt, which encodes structural knowledge into demonstrations to guide the LLM. Empirical results on benchmark datasets demonstrate the effectiveness of KICGPT with smaller training overhead and no finetuning.
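
The retrieve-then-re-rank flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `rerank_top_m`, the parameter `m`, and the `llm_rerank` callable (standing in for a prompted LLM call) are hypothetical, and the handling of entities the LLM omits or invents is an assumption made here so that the output remains a valid ranking.

```python
from typing import Callable, List, Sequence


def rerank_top_m(
    retriever_ranking: Sequence[str],
    m: int,
    llm_rerank: Callable[[Sequence[str]], Sequence[str]],
) -> List[str]:
    """Let an LLM re-order the top-m retriever candidates; keep the rest unchanged.

    `llm_rerank` is a hypothetical stand-in for a prompted LLM call that
    receives the top-m candidates and returns them in its preferred order.
    """
    head = list(retriever_ranking[:m])   # candidates shown to the LLM
    tail = list(retriever_ranking[m:])   # candidates left in retriever order

    allowed = set(head)
    seen = set()
    reranked: List[str] = []
    for entity in llm_rerank(head):
        # Assumption: drop names that were never candidates, and drop repeats.
        if entity in allowed and entity not in seen:
            reranked.append(entity)
            seen.add(entity)

    # Assumption: candidates the LLM omitted fall back to retriever order.
    reranked.extend(e for e in head if e not in seen)
    return reranked + tail


if __name__ == "__main__":
    # Stub "LLM" that prefers c, invents an entity, and repeats itself.
    stub = lambda candidates: ["c", "not-a-candidate", "a", "c"]
    print(rerank_top_m(["a", "b", "c", "d", "e"], m=3, llm_rerank=stub))
    # → ['c', 'a', 'b', 'd', 'e']
```

Restricting the LLM to the top `m` candidates keeps the prompt short, while the fallback steps ensure that every entity ranked by the retriever still appears exactly once in the final list.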

Paper Structure (29 sections, 3 figures, 8 tables)

Figures (3)

  • Figure 1: An illustration of the KICGPT framework.
  • Figure 2: Illustration of a multi-round interaction with the LLM. Stage 3 is repeated many times to provide more demonstrations.
  • Figure 3: Average performance of different models (in terms of Hits@1 and Hits@10) grouped by the logarithm of entity degrees on the FB15k-237 dataset.