Prompting Disentangled Embeddings for Knowledge Graph Completion with Pre-trained Language Model

Yuxia Geng, Jiaoyan Chen, Yuhang Zeng, Zhuo Chen, Wen Zhang, Jeff Z. Pan, Yuxiang Wang, Xiaoliang Xu

TL;DR

This paper tackles knowledge graph completion (KGC) by combining text and structure through prompt tuning on a frozen PLM. It introduces a hard task prompt that recasts KGC as token prediction and a disentangled structure prompt that encodes multi-aspect graph information via a disentangled graph learner; both are fed into a structure-aware text encoder. Two predictors, a textual one operating on the PLM outputs and a structural one based on a KGE model, are trained jointly and ensembled for tail prediction, and a mutual-information regularizer promotes independence among the disentangled components. On WN18RR, FB15K-237, and CoDEx-L, PDKGC often outperforms structure-based, PLM-based, and existing joint methods, with notable gains on datasets with complex neighborhoods. Because the PLM stays frozen and only the prompts are trained, the approach reduces training cost and scales more readily to larger PLMs; inductive KGC is a possible extension.

Abstract

Both graph structures and textual information play a critical role in Knowledge Graph Completion (KGC). With the success of Pre-trained Language Models (PLMs) such as BERT, PLMs have been applied to text encoding for KGC. However, current methods mostly fine-tune the PLMs, leading to huge training costs and limited scalability to larger PLMs. In contrast, we propose to use prompts and perform KGC on a frozen PLM with only the prompts trained. Accordingly, we propose a new KGC method named PDKGC with two prompts: a hard task prompt, which adapts the KGC task to the PLM pre-training task of token prediction, and a disentangled structure prompt, which learns disentangled graph representations so that the PLM can combine more relevant structure knowledge with the text information. With the two prompts, PDKGC builds a textual predictor and a structural predictor, respectively, and their combination leads to more comprehensive entity prediction. Evaluation on three widely used KGC datasets shows that PDKGC often outperforms the baselines, including the state of the art, and that its components are all effective. Our code and data are available at https://github.com/genggengcss/PDKGC.
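
As an illustration of the hard task prompt idea, the following minimal sketch renders an incomplete triple (h, r, ?) as a masked text sequence, so that predicting the tail entity becomes filling in a masked token. The template layout, the special-token strings, and the helper name build_task_prompt are assumptions made for this example; the abstract does not specify the exact template PDKGC uses.

```python
from typing import Optional

# Special tokens in the BERT style; assumed here for illustration only.
CLS, SEP, MASK = "[CLS]", "[SEP]", "[MASK]"


def build_task_prompt(head: str, relation: str,
                      head_desc: Optional[str] = None) -> str:
    """Render an incomplete triple (head, relation, ?) as a masked sequence.

    Hypothetical template: the head text and relation text are separated
    by [SEP], and the missing tail entity is replaced by a single [MASK].
    """
    # Optionally append the entity description to the entity name.
    head_text = f"{head}: {head_desc}" if head_desc else head
    return f"{CLS} {head_text} {SEP} {relation} {SEP} {MASK} {SEP}"


prompt = build_task_prompt("David Beckham", "has son")
print(prompt)
```

A frozen masked language model would then score candidate tail entities at the [MASK] position; how entities are mapped to output tokens is outside the scope of this sketch.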

Paper Structure

This paper contains 28 sections, 12 equations, 4 figures, and 6 tables.

Figures (4)

  • Figure 1: An example of the KG entity "David Beckham" which is associated with neighboring entities of different aspects (e.g., "family", "career") by different kinds of relations.
  • Figure 2: Overview of the proposed PDKGC framework, using the KG triple (David Beckham, has son, ?) as the example to complete. It includes (1) a disentangled graph learner that learns structure semantics of $K$ different aspects for each entity (here $K=3$); (2) a structure-aware text encoder that encodes the triple text together with a set of prefix prompts generated from the disentangled structural embeddings, i.e., the proposed disentangled structure prompt, with $\bm{p}^0$ and $\bm{p}^L$ at the first and last layers of the frozen PLM, respectively; and (3) two predictors. One predictor takes as input the structure-augmented textual encoding, i.e., the $\texttt{[MASK]}$ token's hidden vector at the PLM's final layer, $\bm{w}_{MASK}^L$; the other takes the text-augmented structural encoding, i.e., the structural prompts at the PLM's last layer, $\bm{p}_{h,1}^{L}, \bm{p}_{h,2}^{L}, \bm{p}_{h,3}^{L}$ and $\bm{p}_{r}^L$. Each predicts the probability (score) that $t$ is the correct tail entity, giving $Q^S_t$ and $Q^T_t$, which can be further fused into a final score. Here $\bm{w}_{h,i}^0$ is the input embedding of the $i$-th token in the textual name and description of the head entity $h$, while $\bm{p}_{h,k}^{0}$ is the input token sequence embedding corresponding to $h$'s $k$-th disentangled embedding.
  • Figure 3: Performance (MRR) of different models on test triples of FB15K-237 with different numbers of surrounding entities. "PDKGC [T]" and "PDKGC (single) [T]" denote the results from the textual predictors of PDKGC and $\text{PDKGC}_{\text{single}}$, respectively.
  • Figure 4: Two triples to complete from the test set of FB15K-237. The ranks predicted by different models are highlighted with a green background. For PDKGC and $\text{PDKGC}_{\text{single}}$, we report the ranks produced by their textual predictors, and also present the disentangled components of the head entity, including the correlation weight between each component and the target triple and the component's Top-2 neighbors.
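
The Figure 2 caption notes that the two predictors' scores can be fused into a final score, and Figures 3 and 4 report ranks and MRR. The sketch below shows one way such a fusion and ranking could look. The weighted sum, the weight alpha, the softmax normalization, and all function names are assumptions made for illustration; the captions above do not state the fusion rule PDKGC actually uses.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over candidate-entity scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scores(textual_logits: List[float],
                structural_logits: List[float],
                alpha: float = 0.5) -> List[float]:
    """Hypothetical fusion: convex combination of the two distributions."""
    q_t = softmax(textual_logits)      # stands in for the textual predictor
    q_s = softmax(structural_logits)   # stands in for the structural predictor
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(q_t, q_s)]


def rank_of(scores: List[float], target_index: int) -> int:
    """1-based rank of the target among all candidates (higher is better)."""
    target = scores[target_index]
    return 1 + sum(1 for s in scores if s > target)


# Toy example with three candidate tail entities; index 1 is the true tail.
textual_logits = [2.0, 1.0, 0.0]
structural_logits = [0.0, 3.0, 0.0]
fused = fuse_scores(textual_logits, structural_logits, alpha=0.5)

textual_rank = rank_of(softmax(textual_logits), 1)
fused_rank = rank_of(fused, 1)
reciprocal_rank = 1.0 / fused_rank  # averaged over test triples, this gives MRR
```

In this toy case the textual scores alone rank the true tail second, while the fused scores rank it first, which is the kind of complementarity an ensemble of the two predictors aims for.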