Epistemology of Language Models: Do Language Models Have Holistic Knowledge?

Minsu Kim, James Thorne

TL;DR

This paper explores whether LLMs exhibit characteristics consistent with epistemological holism. Under holism, core knowledge, such as general scientific knowledge, serves as the foundation of our knowledge system and is difficult to revise.

Abstract

This paper investigates the inherent knowledge in language models from the perspective of epistemological holism. The purpose of this paper is to explore whether LLMs exhibit characteristics consistent with epistemological holism. These characteristics suggest that core knowledge, such as general scientific knowledge, plays a specific role, serving as the foundation of our knowledge system and being difficult to revise. To assess these traits related to holism, we created a scientific reasoning dataset and examined the epistemology of language models through three tasks: Abduction, Revision, and Argument Generation. In the abduction task, the language models explained situations while avoiding revising the core knowledge. However, in the other tasks, the language models were found not to distinguish between core and peripheral knowledge, showing an incomplete alignment with holistic knowledge principles.

Paper Structure (37 sections, 5 equations, 8 figures, 23 tables)

Figures (8)

  • Figure 1: A diagram of the holistic web of belief. At the core, there are certain pieces of knowledge that serve as the basis of our beliefs, while towards the periphery, less certain empirical knowledge is located. In this web, all knowledge is revisable, but when we encounter new experiences, the peripheral knowledge is more prone to revision than that at the core.
  • Figure 2: Introduction of the three main tasks. The abduction task is a preference task that investigates whether LLMs favor abductive explanations over negating core statements. The argument generation task explores the capability of language models to produce holistic arguments. The revision task is designed to find out whether language models, when faced with counterexamples, prefer to modify peripheral knowledge or instead opt to alter core knowledge.
  • Figure 3: The argument of the Duhem–Quine thesis. In the hypotheses, implicit assumptions are interconnected with explicit scientific facts. When an observation contradicts a scientific fact, it challenges both the fact and related statements. The conclusion of this process is indeterminate, meaning we can negate either the scientific fact or other implicit propositions. However, as most scientific facts are the basis of our web of belief, we often end up negating auxiliary hypotheses or observational conditions.
  • Figure 4: "Ab_observ" involves a comparison between negating a general fact and negating an observation fact. On the other hand, "Ab_etc" contrasts the negation of a general fact with the utilization of other peripheral facts. "Rev_observ" is a task that involves deciding which needs to be modified between a general fact and the claim that an observation is valid. "Rev_etc", on the other hand, is a task that determines what needs to be revised between a general fact and the absence of other hypothetical conditions.
  • Figure 5: Changes in the success rate of knowledge editing over epochs during Knowledge Edit Supervised-Finetuning. Panel (a) involves training on the negation of factual knowledge, while (b) and (c) involve fine-tuning on the negation of general knowledge and on the counter-observation of general knowledge, respectively. The success rate in (a) is the proportion of cases in which the model negates the trained factual knowledge; in (b) and (c), it is the rate at which general knowledge is answered as false.
  • ...and 3 more figures