A Survey on Knowledge-Enhanced Pre-trained Language Models

Chaoqi Zhen, Yanlei Shang, Xiangyu Liu, Yifei Li, Yong Chen, Dell Zhang

TL;DR

This survey systematically maps the landscape of knowledge-enhanced pre-trained language models (KEPLMs), detailing knowledge types, formats, construction approaches (implicit vs explicit), evaluation metrics, and downstream applications. It clarifies how external knowledge from linguistic, semantic, commonsense, encyclopedic, and domain sources is integrated via graphs, text, or memory modules, and contrasts implicit masking and explicit fusion strategies. The paper highlights benchmarks such as LAMA/LAMA-UHN and KILT to assess knowledge capacity and task performance, and discusses trade-offs in model size and compute efficiency. It also outlines practical applications across NLU and NLG, and proposes directions for future work including more diverse knowledge sources, unified KEPLMs, and improved interpretability and robustness.

Abstract

Natural Language Processing (NLP) has been revolutionized by the use of Pre-trained Language Models (PLMs) such as BERT. Despite setting new records in nearly every NLP task, PLMs still face a number of challenges, including poor interpretability, weak reasoning capability, and the need for large amounts of expensive annotated data when applied to downstream tasks. By integrating external knowledge into PLMs, Knowledge-Enhanced Pre-trained Language Models (KEPLMs) have the potential to overcome these limitations. In this paper, we examine KEPLMs systematically through a series of studies. Specifically, we outline the common types and different formats of knowledge to be integrated into KEPLMs, detail the existing methods for building and evaluating KEPLMs, present the applications of KEPLMs in downstream tasks, and discuss future research directions. Researchers will benefit from this survey by gaining a quick and comprehensive overview of the latest developments in this field.

Paper Structure

This paper contains 54 sections, 2 equations, 12 figures, 8 tables.

Figures (12)

  • Figure 1: The benefits of integrating external knowledge into PLMs, according to ChatGPT, one of the largest PLMs today.
  • Figure 2: Incorporating visual knowledge from captioned images into PLMs.
  • Figure 3: Using knowledge-guided masking strategies to build KEPLMs.
  • Figure 4: GLM's knowledge-graph informed sampling of entities for masking.
  • Figure 5: E-BERT's adaptive hybrid masking strategy.
  • ...and 7 more figures