A Survey of Knowledge Enhanced Pre-trained Language Models

Linmei Hu, Zeyi Liu, Ziwang Zhao, Lei Hou, Liqiang Nie, Juanzi Li

TL;DR

Knowledge-enhanced pre-trained language models (KE-PLMs) aim to overcome PLMs' lack of external knowledge and weak reasoning by integrating diverse knowledge sources. The paper proposes taxonomies for KE-PLMs across NLU and NLG, and synthesizes representative methods organized by knowledge type and fusion strategy. It covers linguistic, text, KG, and rule knowledge for NLU, and retrieval-based and KG-based approaches for NLG, and discusses fusion at the pre-training stage versus the fine-tuning stage. It concludes with future directions including multi-modal knowledge, continual learning, efficiency, and interpretability, highlighting the practical impact on robust, knowledgeable NLP systems.

Abstract

Pre-trained Language Models (PLMs), which are trained on large text corpora via self-supervised learning, have yielded promising performance on various tasks in Natural Language Processing (NLP). However, although PLMs with huge numbers of parameters can acquire rich knowledge from massive training text and benefit downstream tasks at the fine-tuning stage, they still have limitations, such as poor reasoning ability, due to the lack of external knowledge. Research has been dedicated to incorporating knowledge into PLMs to tackle these issues. In this paper, we present a comprehensive review of Knowledge Enhanced Pre-trained Language Models (KE-PLMs) to provide clear insight into this thriving field. We introduce separate taxonomies for Natural Language Understanding (NLU) and Natural Language Generation (NLG) to highlight these two main tasks of NLP. For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG), and rule knowledge. The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods. Finally, we point out some promising future directions for KE-PLMs.

Paper Structure (22 sections, 8 figures, 4 tables)

This paper contains 22 sections, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Taxonomy of Knowledge Enhanced Pre-trained Language Models (KE-PLMs) based on the two core tasks of NLP: Natural Language Understanding (NLU) and Natural Language Generation (NLG).
  • Figure 2: Categorization of linguistic knowledge.
  • Figure 3: Categorization of knowledge graph and injection methods of each sub-category.
  • Figure 4: (a) Incorporating triplet knowledge through a knowledge fusion module. (b) KLMO designs a knowledge aggregator to fuse knowledge into the input token sequence. (c) KERM develops a knowledge injector to integrate knowledge explicitly.
  • Figure 5: Further categorization of retrieval-based and KG-based methods. The left panel shows the categorization of retrieval-based methods, and the right panel shows the categorization of KG-based methods.
  • ...and 3 more figures