JAKET: Joint Pre-training of Knowledge Graph and Language Understanding

Donghan Yu, Chenguang Zhu, Yiming Yang, Michael Zeng

TL;DR

JAKET tackles the challenge of integrating knowledge graphs into language understanding by jointly pre-training a knowledge module (KM) and a language module (LM). It introduces a two-stage language model to break the cyclic dependency between the LM and the KM, a KM based on a relational graph attention network (GAT) with CompGCN-like fusion, and an entity context embedding memory to accelerate training. Self-supervised tasks across both modules enable joint embedding alignment and knowledge grounding, and fine-tuning supports KGs unseen during pre-training. Empirical results on few-shot relation classification, KGQA, and unseen-KG entity classification show consistent gains over strong baselines, highlighting the method's cross-domain adaptability and efficiency.

Abstract

Knowledge graphs (KGs) contain rich information about world knowledge, entities and relations. Thus, they can be great supplements to existing pre-trained language models. However, it remains a challenge to efficiently integrate information from KGs into language modeling. In addition, understanding a knowledge graph requires related context. We propose a novel joint pre-training framework, JAKET, to model both the knowledge graph and language. The knowledge module and language module provide essential information to mutually assist each other: the knowledge module produces embeddings for entities in text, while the language module generates context-aware initial embeddings for entities and relations in the graph. Our design enables the pre-trained model to easily adapt to unseen knowledge graphs in new domains. Experimental results on several knowledge-aware NLP tasks show that our proposed framework achieves superior performance by effectively leveraging knowledge in language understanding.

Paper Structure

This paper contains 16 sections, 4 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: A simple illustration of the novelty of our proposed model, JAKET.
  • Figure 2: A demonstration for the structure of JAKET, where the language module is on the left side marked green while the knowledge module is on the right side marked blue. Symbol Ⓧ indicates the steps to compute context representations introduced in Section \ref{sec:cyclic}. "QX", "PX" and "CX" are the indices for entities, relations and categories in KG respectively. Entity mentions in text are underlined and italicized such as Sun.