Large Knowledge Model: Perspectives and Challenges

Huajun Chen

TL;DR

The paper surveys how knowledge graphs and large language models can be integrated to overcome each other’s limitations: KGs provide precise, interpretable structure while LLMs offer broad, versatile knowledge coverage. It discusses methods to augment LLMs with knowledge (structural pretraining, KG-informed prompts, KG-RAG, and editing) and how LLMs can enhance symbolic KBs (KG building, querying, and reasoning). It then advocates a shift toward Large Knowledge Models (LKMs) via decoupling knowledge from language, cognitive-alignment strategies, perception-cognition integration, and a five-A principle (Augmented, Authentic, Accountable, Abundant, Aligned). The work highlights practical pathways to robust, hallucination-resilient, and more human-aligned AI capable of interacting with the real world across modalities and domains.

Abstract

Humankind's understanding of the world is fundamentally linked to our perception and cognition, with human languages serving as one of the major carriers of world knowledge. In this vein, Large Language Models (LLMs) like ChatGPT epitomize the pre-training of extensive, sequence-based world knowledge into neural networks, facilitating the processing and manipulation of this knowledge in a parametric space. This article explores large models through the lens of "knowledge". We first investigate the role of symbolic knowledge such as Knowledge Graphs (KGs) in enhancing LLMs, covering aspects like knowledge-augmented language models, structure-inducing pre-training, knowledgeable prompts, structured CoT, knowledge editing, semantic tools for LLMs, and knowledgeable AI agents. Subsequently, we examine how LLMs can boost traditional symbolic knowledge bases, encompassing aspects like using LLMs as KG builders and controllers, structured knowledge pretraining, and LLM-enhanced symbolic reasoning. Considering the intricate nature of human knowledge, we advocate for the creation of Large Knowledge Models (LKMs), specifically engineered to manage a diversified spectrum of knowledge structures. This promising undertaking would entail several key challenges, such as disentangling the knowledge base from language models, cognitive alignment with human knowledge, integration of perception and cognition, and building large commonsense models for interacting with the physical world, among others. Finally, we propose a five-"A" principle to distinguish the concept of LKM.

Paper Structure (39 sections, 13 figures)

Figures (13)

  • Figure 1: Language, Knowledge, Language Models and Knowledge Graphs
  • Figure 2: An Outline of the Whole Content Structure
  • Figure 3: The Dichotomy in Knowledge Representation
  • Figure 4: Knowledge Injection on Different Layers
  • Figure 5: Structure-Augmented Pre-training
  • ...and 8 more figures