KGValidator: A Framework for Automatic Validation of Knowledge Graph Construction

Jack Boylan; Shashank Mangla; Dominic Thorn; Demian Gholipour Ghalandari; Parsa Ghaffari; Chris Hokamp

TL;DR

Evaluating knowledge graph (KG) completion is hampered by open-world incompleteness and the cost of human annotation. The authors propose KGValidator, a framework that uses LLMs to validate candidate KG triples without gold references, drawing on four sources of evidence: the LLM's intrinsic knowledge, textual context, reference KGs such as Wikidata, and web search. Structured outputs are enforced with Pydantic and the Instructor library. They demonstrate improvements in triple classification accuracy across multiple KG benchmarks and analyze how the variety of context affects performance, while discussing the limitations of current open-source LLMs and the need for broader tool support. The work points toward scalable, context-grounded KG validation, with practical implications for maintaining and updating large knowledge bases such as Wikidata.

Abstract

This study explores the use of Large Language Models (LLMs) for automatic evaluation of knowledge graph (KG) completion models. Historically, validating information in KGs has been a challenging task, requiring large-scale human annotation at prohibitive cost. With the emergence of general-purpose generative AI and LLMs, it is now plausible that human-in-the-loop validation could be replaced by a generative agent. We introduce a framework for consistency and validation when using generative models to validate knowledge graphs. Our framework is based upon recent open-source developments for structural and semantic validation of LLM outputs, and upon flexible approaches to fact checking and verification, supported by the capacity to reference external knowledge sources of any kind. The design is easy to adapt and extend, and can be used to verify any kind of graph-structured data through a combination of model-intrinsic knowledge, user-supplied context, and agents capable of external knowledge retrieval.
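The idea of structurally validating a model's verdict on a triple can be illustrated with a small sketch. This is not the authors' implementation: it uses only the Python standard library, with a dataclass standing in for the kind of schema check a Pydantic model would perform, and a toy function standing in for the LLM call. The label set, the names `Triple`, `ValidatedTriple`, `validate_triple`, and `toy_judge`, and the string-matching heuristic are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed label set for illustration; the framework's actual labels may differ.
ALLOWED_LABELS = {"valid", "invalid", "not enough information"}


@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    obj: str


@dataclass(frozen=True)
class ValidatedTriple:
    """Structured verdict; __post_init__ plays the role of a schema validator."""
    triple: Triple
    label: str
    reason: str

    def __post_init__(self) -> None:
        if self.label not in ALLOWED_LABELS:
            raise ValueError(f"unexpected label: {self.label!r}")
        if not self.reason.strip():
            raise ValueError("a non-empty reason is required")


# A judge maps (triple, optional context) to a raw dict, as an LLM call might.
Judge = Callable[[Triple, Optional[str]], dict]


def validate_triple(triple: Triple, judge: Judge,
                    context: Optional[str] = None) -> ValidatedTriple:
    """Ask the judge for a verdict and reject malformed output."""
    raw = judge(triple, context)
    return ValidatedTriple(
        triple=triple,
        label=raw.get("label", ""),
        reason=raw.get("reason", ""),
    )


def toy_judge(triple: Triple, context: Optional[str]) -> dict:
    """Hypothetical stand-in for an LLM: checks entity mentions in the context."""
    if context and triple.subject in context and triple.obj in context:
        return {"label": "valid",
                "reason": "Both entities are mentioned in the context."}
    return {"label": "not enough information",
            "reason": "The context does not mention both entities."}


if __name__ == "__main__":
    t = Triple("James Joyce", "author_of", "Ulysses")
    print(validate_triple(t, toy_judge, "James Joyce wrote Ulysses."))
    print(validate_triple(t, toy_judge))
```

Swapping `toy_judge` for a function that queries an LLM, a reference KG, or a web search tool would leave the validation step unchanged, which is the separation the abstract describes between knowledge sources and output validation.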

Paper Structure (43 sections, 12 figures, 4 tables)

Introduction
Challenges and Paradigms in KG Completion Evaluation:
KGValidator Framework:
Background
Knowledge Graph Construction
LLMs and Knowledge Graphs
Knowledge Graph Construction Using Generative AI
Structuring and Validating Language Model Output
Knowledge-Grounded LLMs
Knowledge Graph Evaluation
Approach
Basic Settings for Validation:
Validation via Pydantic Models
Validation Contexts
Validating with LLM Knowledge
...and 28 more sections

Figures (12)

Figure 1: Framework for Validating Knowledge Graph Triples.
Figure 2: An example of the Closed-World Assumption in KG completion. Some of the triples predicted by a KG completion model are true in the real world (e.g. books written by James Joyce) but missing in the test set and would therefore be treated as false positives.
Figure 3: An example of Open Information Extraction. Note that in OpenIE, the output schema is not fixed.
Figure 4: Validating KGs with LLM Knowledge
Figure 5: Validating KGs given Textual Context
...and 7 more figures
