MKGL: Mastery of a Three-Word Language

Lingbing Guo, Zhongpu Bo, Zhuo Chen, Yichi Zhang, Jiaoyan Chen, Yarong Lan, Mengshu Sun, Zhiqiang Zhang, Yangyifei Luo, Qian Li, Qiang Zhang, Wen Zhang, Huajun Chen

TL;DR

The results reveal that LLMs can achieve fluency in KGL, drastically reducing errors compared to conventional KG embedding methods on KG completion, and that the enhanced LLM shows exceptional competence in generating accurate three-word sentences from an initial entity and in interpreting new, unseen terms from outside the KG.

Abstract

Large language models (LLMs) have significantly advanced performance across a spectrum of natural language processing (NLP) tasks. Yet, their application to knowledge graphs (KGs), which describe facts in the form of triplets and allow minimal hallucinations, remains an underexplored frontier. In this paper, we investigate the integration of LLMs with KGs by introducing a specialized KG Language (KGL), where a sentence precisely consists of an entity noun, a relation verb, and ends with another entity noun. Despite KGL's unfamiliar vocabulary to the LLM, we facilitate its learning through a tailored dictionary and illustrative sentences, and enhance context understanding via real-time KG context retrieval and KGL token embedding augmentation. Our results reveal that LLMs can achieve fluency in KGL, drastically reducing errors compared to conventional KG embedding methods on KG completion. Furthermore, our enhanced LLM shows exceptional competence in generating accurate three-word sentences from an initial entity and interpreting new unseen terms out of KGs.
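The three-word sentence format described in the abstract can be pictured with a minimal sketch. The class name `KGLSentence`, the `<kgl: ...>` token spelling, and the example fact are illustrative assumptions, not taken from the paper's code or datasets; the only thing the sketch relies on is that a KGL sentence is an entity noun, a relation verb, and another entity noun, with each word treated as a single special token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KGLSentence:
    """One KGL sentence: entity noun, relation verb, entity noun."""
    head: str
    relation: str
    tail: str

    def to_tokens(self):
        # Hypothetical token spelling: each KGL word becomes one special
        # token rather than a sequence of ordinary text tokens.
        return [f"<kgl: {word}>" for word in (self.head, self.relation, self.tail)]


def completion_prefix(head, relation):
    """Build the two-word prefix that a model would be asked to complete."""
    return " ".join(f"<kgl: {word}>" for word in (head, relation))


# Illustrative fact only; not drawn from the paper's benchmarks.
sentence = KGLSentence("Paris", "capital of", "France")
tokens = sentence.to_tokens()
prefix = completion_prefix(sentence.head, sentence.relation)
```

KG completion in this framing amounts to predicting the third token given the two-token prefix, which is why the output can be scored as a distribution over candidate entities.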

Paper Structure

This paper contains 43 sections, 8 equations, 7 figures, 8 tables, 1 algorithm.

Figures (7)

  • Figure 1: A workflow of MKGL (from bottom to top). The instruction to the LLM includes a dictionary exemplifying the entity $e_i$ and relation $r_k$. The task is to construct new KG sentences initialized with $e_i r_k$. The tokenizer first tokenizes the input text, where the entities and relations are represented as special tokens outside the original vocabulary. (a) To process these special tokens, MKGL collects the embeddings of their constituent text tokens; (b) Then, a retriever performs a 4-step process to aggregate textual and relational information into KGL token embeddings. The first and the last steps are LoRA-like down-scaling and up-scaling operations; (c) The output is assigned as the embeddings of these special KGL tokens; (d) Similar to the context retriever, we design a score retriever to retrieve the score information. (f) The output is in the form of a probability distribution among candidate entities.
  • Figure 2: Illustration of LoRA-based KGL Context Retriever. (a) The token embeddings are first scaled down to lower-dimensional vectors; (b) For each input KGL token, its constituent textual token embeddings are aggregated by a PNA encoder; (c) The output embeddings are further aggregated by multi-layered PNA encoders to retrieve neighboring information within the KG; (e) The final embeddings are assigned to the KGL tokens.
  • Figure 3: Illustration of KGL modeling. The left shows the performance degradation (in lighter shades) from consecutive predictions of relations and entities. The right presents sentences generated by MKGL, with deeper hues indicating higher probabilities. In the final column, colors grey, green, yellow, and red represent existing, valid, valid but not within the KG, and invalid, respectively.
  • Figure 4: A comprehensive comparison between the methods with randomly-initialized new entity token embeddings (denoted by NewToken) and MKGL on FB15k-237. 1-hop and 2-hop are the versions leveraging the 1-hop and 2-hop KG neighboring information. MKGL w/o SR denotes MKGL without the score retriever.
  • Figure 5: More examples on KGL modeling.
  • ...and 2 more figures
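The context-retriever steps in the Figure 1 and Figure 2 captions (down-scale, aggregate text tokens, aggregate KG neighbors, up-scale) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: plain mean pooling stands in for the PNA encoders, a single aggregation round replaces the multi-layered encoder, and the function names and the `down` and `up` projection matrices are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mean_pool(vectors):
    """Element-wise mean of equal-length vectors (stand-in for a PNA encoder)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def kgl_token_embedding(text_token_embs, neighbor_text_token_embs, down, up):
    """Build one KGL token embedding from its text tokens and KG neighbors."""
    # Step 1: LoRA-like down-scaling of each constituent text-token embedding.
    low = [matvec(down, emb) for emb in text_token_embs]
    # Step 2: aggregate the text-token embeddings of this KGL token.
    own = mean_pool(low)
    # Step 3: aggregate neighboring KGL tokens, each built the same way.
    neighbors = [
        mean_pool([matvec(down, emb) for emb in embs])
        for embs in neighbor_text_token_embs
    ]
    context = mean_pool([own] + neighbors)
    # Step 4: LoRA-like up-scaling back to the LLM's hidden size.
    return matvec(up, context)


# Toy dimensions: hidden size 4, low-rank size 2.
down = [[1, 0, 0, 0], [0, 1, 0, 0]]
up = [[1, 0], [0, 1], [0, 0], [0, 0]]
embedding = kgl_token_embedding(
    text_token_embs=[[1, 2, 3, 4], [3, 4, 5, 6]],
    neighbor_text_token_embs=[[[4, 5, 0, 0]]],
    down=down,
    up=up,
)
```

Working in the low-dimensional space keeps the aggregation cheap, and the resulting vector is what would be assigned as the embedding of the special KGL token.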