Finetuning Generative Large Language Models with Discrimination Instructions for Knowledge Graph Completion

Yang Liu, Xiaobin Tian, Zequn Sun, Wei Hu

TL;DR

This work tackles knowledge graph completion with large language models while avoiding grounding errors by finetuning an open LLM using discrimination instructions. It combines candidate entities from a lightweight embedding model, truncated sampling to reduce data while preserving informative examples, and KG embeddings injected into the LLM to boost graph reasoning. The proposed DIFT framework achieves state-of-the-art results on FB15K-237 and WN18RR, outperforming both embedding-based and generation-based baselines while remaining computation-efficient via LoRA/QLoRA and candidate-based prompting. These findings demonstrate that discrimination-informed finetuning can unlock robust KG reasoning in LLMs with practical efficiency, guiding future work on KGQA and entity alignment.

Abstract

Traditional knowledge graph (KG) completion models learn embeddings to predict missing facts. Recent works attempt to complete KGs in a text-generation manner with large language models (LLMs). However, they need to ground the output of LLMs to KG entities, which inevitably brings errors. In this paper, we present a finetuning framework, DIFT, aiming to unleash the KG completion ability of LLMs and avoid grounding errors. Given an incomplete fact, DIFT employs a lightweight model to obtain candidate entities and finetunes an LLM with discrimination instructions to select the correct one from the given candidates. To improve performance while reducing instruction data, DIFT uses a truncated sampling method to select useful facts for finetuning and injects KG embeddings into the LLM. Extensive experiments on benchmark datasets demonstrate the effectiveness of our proposed framework.
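The candidate-selection step described in the abstract can be illustrated with a minimal sketch. It assumes a tail-prediction query and a list of candidates already ranked by some lightweight embedding model. The prompt wording, the function names `build_discrimination_prompt` and `ground_answer`, and the fallback to the top-ranked candidate are all hypothetical choices made for illustration; they are not taken from the paper.

```python
from typing import List


def build_discrimination_prompt(head: str, relation: str, candidates: List[str]) -> str:
    """Format a tail-prediction query as a pick-one-candidate instruction.

    The template wording is a hypothetical example, not the paper's prompt.
    """
    lines = [
        f"Incomplete fact: ({head}, {relation}, ?)",
        "Select the correct tail entity from the candidates below.",
        "Candidates:",
    ]
    # Candidates are assumed to arrive ranked by the lightweight embedding model.
    lines += [f"{i}. {name}" for i, name in enumerate(candidates, start=1)]
    lines.append("Answer with exactly one candidate name.")
    return "\n".join(lines)


def ground_answer(answer: str, candidates: List[str]) -> str:
    """Map the LLM's reply to a candidate by case-insensitive exact match.

    Falling back to the top-ranked candidate is an assumption of this sketch.
    """
    cleaned = answer.strip().lower()
    for name in candidates:
        if name.lower() == cleaned:
            return name
    return candidates[0]


candidates = ["Paris", "Lyon", "Marseille"]
prompt = build_discrimination_prompt("France", "capital", candidates)
```

Because the model is asked to choose among listed entities rather than generate free text, its reply can be matched directly against the candidate list, which is the intuition behind avoiding grounding errors.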

Paper Structure (22 sections, 4 equations, 4 figures, 6 tables)

Figures (4)

  • Figure 1: Illustration of the proposed DIFT framework.
  • Figure 2: Hits@1 results and training time of DIFT on FB15K-237 and WN18RR as the number of candidate entities varies.
  • Figure 3: Hits@1 results and training time of DIFT on FB15K-237 and WN18RR as the truncated-sampling threshold varies.
  • Figure 4: Correct predictions of DIFT and CoLE on FB15K-237 and WN18RR. The light blue area represents the triplets predicted correctly by DIFT but not by CoLE. The dark green area represents the triplets predicted correctly by both DIFT and CoLE. The light green area represents the triplets predicted correctly by CoLE but not by DIFT.