JAKET: Joint Pre-training of Knowledge Graph and Language Understanding
Donghan Yu, Chenguang Zhu, Yiming Yang, Michael Zeng
TL;DR
JAKET integrates knowledge graphs into language understanding by jointly pre-training a knowledge module (KM) and a language module (LM). It introduces a two-stage language model that breaks the cyclic dependency between the LM and the KM, a relational GAT-based KM with CompGCN-like fusion, and an entity context embedding memory that accelerates training. Self-supervised tasks on both modules align the two embedding spaces and ground the language representations in the graph, and the pre-trained model can be fine-tuned on previously unseen KGs. Experiments on few-shot relation classification, KGQA, and entity classification over an unseen KG show consistent gains over strong baselines, which indicates that the method adapts across domains.
Abstract
Knowledge graphs (KGs) contain rich information about world knowledge, entities and relations. Thus, they can be great supplements to existing pre-trained language models. However, it remains a challenge to efficiently integrate information from a KG into language modeling, and understanding a knowledge graph in turn requires related context. We propose a novel joint pre-training framework, JAKET, to model both the knowledge graph and language. The knowledge module and language module provide essential information to mutually assist each other: the knowledge module produces embeddings for entities in text, while the language module generates context-aware initial embeddings for entities and relations in the graph. Our design enables the pre-trained model to easily adapt to unseen knowledge graphs in new domains. Experimental results on several knowledge-aware NLP tasks show that our proposed framework achieves superior performance by effectively leveraging knowledge in language understanding.
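The mutual dependency described above (the knowledge module needs text-derived initial embeddings, while the language module needs entity embeddings from the graph) can be illustrated with a toy data flow. The following is a minimal sketch under stated assumptions, not the authors' implementation: every function name, the dimension `DIM`, the hash-seeded "encoder", the mean-based neighbor aggregation, and the choice to add entity vectors at mention positions are hypothetical stand-ins for the Transformer layers and the relational GAT. It only shows why splitting the language module into two stages removes the cycle: stage 1 never consumes knowledge-module output.

```python
import random

DIM = 4  # toy embedding size (assumption, chosen for readability)


def toy_encoder(tokens, seed):
    """Stand-in for a Transformer stack: one deterministic vector per token."""
    out = []
    for tok in tokens:
        rng = random.Random(f"{seed}:{tok}")
        out.append([rng.uniform(-1.0, 1.0) for _ in range(DIM)])
    return out


def mean_pool(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def lm_stage1(tokens):
    """First LM stage: depends on text only, never on the knowledge module."""
    return toy_encoder(tokens, "lm1")


def init_entity_embeddings(descriptions):
    """Initial graph embeddings come from the LM applied to entity descriptions."""
    return {e: mean_pool(lm_stage1(text.split())) for e, text in descriptions.items()}


def knowledge_module(init_emb, edges):
    """Stand-in for the relational GAT: one round of neighbor averaging.
    Relation labels are ignored here, unlike in a real relational model."""
    neighbors = {e: [] for e in init_emb}
    for head, _relation, tail in edges:
        neighbors[head].append(tail)
        neighbors[tail].append(head)
    return {
        e: mean_pool([init_emb[e]] + [init_emb[n] for n in neighbors[e]])
        for e in init_emb
    }


def lm_stage2(hidden):
    """Second LM stage: mixes each position with the sequence mean."""
    pooled = mean_pool(hidden)
    return [[0.5 * a + 0.5 * b for a, b in zip(h, pooled)] for h in hidden]


def joint_forward(tokens, mentions, descriptions, edges):
    """Text -> stage 1 -> inject KM entity embeddings at mentions -> stage 2."""
    hidden = lm_stage1(tokens)
    entity_emb = knowledge_module(init_entity_embeddings(descriptions), edges)
    for position, entity in mentions.items():
        hidden[position] = [a + b for a, b in zip(hidden[position], entity_emb[entity])]
    return lm_stage2(hidden)


# Hypothetical example data.
tokens = "Paris is the capital of France".split()
mentions = {0: "Paris", 5: "France"}
descriptions = {
    "Paris": "capital city of France",
    "France": "country in western Europe",
}
edges = [("Paris", "capital_of", "France")]

with_kg = joint_forward(tokens, mentions, descriptions, edges)
without_kg = joint_forward(tokens, {}, descriptions, edges)
```

Comparing `with_kg` and `without_kg` shows how graph information reaches the final token representations only through the injection step between the two stages.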
