Prompting Disentangled Embeddings for Knowledge Graph Completion with Pre-trained Language Model
Yuxia Geng, Jiaoyan Chen, Yuhang Zeng, Zhuo Chen, Wen Zhang, Jeff Z. Pan, Yuxiang Wang, Xiaoliang Xu
TL;DR
This paper tackles knowledge graph completion (KGC) by combining text and structure through prompt tuning on a frozen PLM. The proposed method, PDKGC, introduces a hard task prompt that recasts KGC as token prediction and a disentangled structure prompt that encodes multi-aspect graph information via a graph learner; both prompts are fed into a structure-aware text encoder. Two predictors, a textual one operating on the PLM outputs and a structural one built on a KGE model, are trained jointly and ensembled for tail prediction, and a mutual-information regularizer promotes independence among the disentangled components. On WN18RR, FB15K-237, and CoDEx-L, PDKGC often outperforms structure-based, PLM-based, and existing joint methods, with notable gains on datasets with complex neighborhoods. Keeping the PLM frozen and training only the prompts reduces training cost and improves scalability, which could extend the approach to larger frozen models and to inductive KGC.
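As a rough illustration of how the outputs of the two predictors might be ensembled, the sketch below normalizes each predictor's candidate scores with a softmax and takes a weighted sum. The function names, the softmax normalization, and the `weight` parameter are assumptions made for this example; the summary above does not specify the exact combination rule used by PDKGC.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_rank(text_scores, struct_scores, weight=0.5):
    """Combine two predictors' scores over the same candidate entities.

    `weight` (hypothetical) is the share given to the textual predictor;
    the structural predictor receives `1 - weight`.
    Returns (indices sorted best-first, combined scores).
    """
    if len(text_scores) != len(struct_scores):
        raise ValueError("both predictors must score the same candidates")
    p_text = softmax(text_scores)
    p_struct = softmax(struct_scores)
    combined = [
        weight * t + (1.0 - weight) * s for t, s in zip(p_text, p_struct)
    ]
    ranking = sorted(
        range(len(combined)), key=lambda i: combined[i], reverse=True
    )
    return ranking, combined


# Toy scores for three candidate tail entities (made-up numbers).
ranking, combined = ensemble_rank([2.0, 1.0, 0.0], [0.0, 1.0, 5.0])
```

With equal weighting, a candidate that one predictor scores very confidently can outrank the other predictor's top choice, which is the intended effect of combining the two views.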
Abstract
Both graph structures and textual information play a critical role in Knowledge Graph Completion (KGC). With the success of Pre-trained Language Models (PLMs) such as BERT, PLMs have been applied to text encoding for KGC. However, current methods mostly fine-tune the PLMs, leading to huge training costs and limited scalability to larger PLMs. In contrast, we propose to utilize prompts and perform KGC on a frozen PLM with only the prompts trained. Accordingly, we propose a new KGC method named PDKGC with two prompts: a hard task prompt, which adapts the KGC task to the PLM pre-training task of token prediction, and a disentangled structure prompt, which learns disentangled graph representations so that the PLM can combine more relevant structure knowledge with the text information. With the two prompts, PDKGC builds a textual predictor and a structural predictor, respectively, and their combination leads to more comprehensive entity prediction. Evaluation on three widely used KGC datasets shows that PDKGC often outperforms the baselines, including the state of the art, and that its components are all effective. Our code and data are available at https://github.com/genggengcss/PDKGC.
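To make the idea of a hard task prompt more concrete, the following minimal sketch formats a (head, relation, ?) query as a cloze-style input whose masked slot stands for the missing tail entity. The template layout, the `[SEP]` and `[MASK]` placeholders, the helper names, and the toy scores are all assumptions for illustration; the exact prompt format used by PDKGC is not given in the abstract.

```python
def build_task_prompt(head, relation, mask_token="[MASK]", sep_token="[SEP]"):
    """Format a (head, relation, ?) query as a cloze-style prompt.

    The template is hypothetical: head text, relation text, then a mask
    slot that a masked language model would be asked to fill.
    """
    return f"{head} {sep_token} {relation} {sep_token} {mask_token}"


def predict_tail(mask_scores, top_k=1):
    """Return the top-k candidate entities by score at the mask slot."""
    ranked = sorted(mask_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [entity for entity, _ in ranked[:top_k]]


prompt = build_task_prompt("Leonardo da Vinci", "place of birth")

# Stand-in for a frozen PLM's scores at the mask position (made-up numbers).
toy_scores = {"Vinci": 3.1, "Florence": 2.4, "Paris": 0.2}
top_two = predict_tail(toy_scores, top_k=2)
```

Because the query is phrased as a token-prediction problem, the PLM itself can stay frozen; in this sketch only the inputs change, which mirrors the motivation stated in the abstract.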
