Mining the Mind: What 100M Beliefs Reveal About Frontier LLM Knowledge

Shrestha Ghosh, Luca Giordano, Yujia Hu, Tuan-Phong Nguyen, Simon Razniewski

TL;DR

This work analyzes the factual knowledge encoded in a frontier LLM using GPTKB v1.5, a large-scale, recursively elicited knowledge base derived from GPT-4.1 that contains over 100M factual assertions. It shows that the model stores vast knowledge with biases that differ from those of traditional knowledge bases, that its factual accuracy is about 75%, and that inconsistency and hallucinations are substantial, particularly in dynamic or politically charged domains. The authors validate the approach by surveying the knowledge base's size, taxonomy, language distribution, and literals; they report notable gender and geographic biases and a robust multilingual footprint, and they assess timeliness through recency signals. The study provides a careful, scalable methodology for probing closed-source models and discusses implications for real-time data integration, bias mitigation, and future factuality research in LLMs.

Abstract

LLMs are remarkable artifacts that have revolutionized a range of NLP and AI tasks. A significant contributor is their factual knowledge, which, to date, remains poorly understood and is usually analyzed from biased samples. In this paper, we take a deep tour into the factual knowledge (or beliefs) of a frontier LLM, based on GPTKB v1.5 (Hu et al., 2025a), a recursively elicited set of 100 million beliefs of one of the strongest currently available frontier LLMs, GPT-4.1. We find that the model's factual knowledge differs quite significantly from established knowledge bases, and that its accuracy is significantly lower than indicated by previous benchmarks. We also find that inconsistency, ambiguity, and hallucinations are major issues, shedding light on future research opportunities concerning factual LLM knowledge.

Paper Structure

This paper contains 41 sections, 9 figures, 12 tables.

Figures (9)

  • Figure 1: Language distribution in GPTKB.
  • Figure 2: Gender skew in professions and nationalities.
  • Figure 3: Trend of triple accuracy across layers.
  • Figure 4: Trend of entity verifiability across layers.
  • Figure 5: Percentage of total triples present symmetrically for the 20 most frequent symmetric relations.
  • ...and 4 more figures