Ontology-grounded Automatic Knowledge Graph Construction by LLM under Wikidata schema
Xiaohan Feng, Xixin Wu, Helen Meng
TL;DR
The paper addresses scalable knowledge graph construction for proprietary domains by grounding LLM-generated knowledge in an ontology aligned with Wikidata. It introduces a four-stage pipeline (competency-question generation, ontology matching to Wikidata, ontology formatting, and KG construction) to ensure consistency, interpretability, and interoperability. Experimental results on Wiki-NRE, SciERC, and WebNLG demonstrate competitive performance, with notable gains under target-schema constraints, and offer insights into the trade-offs of unconstrained ontology expansion. The approach provides an interpretable, auditable, and QA-friendly KG construction framework that can integrate with Wikidata and scale with limited human input.
Abstract
We propose an ontology-grounded approach to Knowledge Graph (KG) construction using Large Language Models (LLMs) on a knowledge base. An ontology is authored by generating Competency Questions (CQs) on the knowledge base to discover the knowledge scope, extracting relations from the CQs, and attempting to replace equivalent relations with their counterparts in Wikidata. To ensure consistency and interpretability in the resulting KG, we ground KG generation in the authored ontology based on the extracted relations. Evaluation on benchmark datasets demonstrates competitive performance on the knowledge graph construction task. Our work presents a promising direction for a scalable KG construction pipeline with minimal human intervention that yields high-quality, human-interpretable KGs, which are interoperable with Wikidata semantics for potential knowledge base expansion.
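As a rough illustration of the grounding idea described above, the following minimal sketch maps extracted relation labels to Wikidata properties and then keeps only candidate triples whose predicate appears in the resulting ontology. It is not the authors' implementation: the LLM-driven steps (CQ generation, relation extraction, and relation matching) are replaced here by a small hard-coded lookup table, and the names `WIKIDATA_PROPERTIES`, `Relation`, `match_relations`, and `ground_triples` are hypothetical. The property IDs shown (P19, P108, P69) are real Wikidata properties; the entity names in the usage example are placeholders.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for a Wikidata property search. The IDs are real
# Wikidata properties, but exact-label lookup is an assumption of this sketch.
WIKIDATA_PROPERTIES = {
    "place of birth": "P19",
    "employer": "P108",
    "educated at": "P69",
}


@dataclass(frozen=True)
class Relation:
    label: str
    wikidata_id: Optional[str]  # None when no equivalent property was found


def match_relations(extracted_labels):
    """Build an ontology: normalized label -> Relation, matched to Wikidata if possible."""
    ontology = {}
    for label in extracted_labels:
        key = label.strip().lower()
        ontology[key] = Relation(key, WIKIDATA_PROPERTIES.get(key))
    return ontology


def ground_triples(candidate_triples, ontology):
    """Keep only triples whose predicate is in the ontology.

    Matched predicates are rewritten to their Wikidata property ID;
    unmatched ones keep their normalized label.
    """
    grounded = []
    for subj, pred, obj in candidate_triples:
        rel = ontology.get(pred.strip().lower())
        if rel is not None:
            grounded.append((subj, rel.wikidata_id or rel.label, obj))
    return grounded


if __name__ == "__main__":
    ontology = match_relations(["Place of birth", "employer", "mentored by"])
    candidates = [
        ("Alice", "place of birth", "Springfield"),
        ("Alice", "likes", "chess"),        # predicate not in the ontology: dropped
        ("Alice", "mentored by", "Bob"),    # kept under its custom label
    ]
    for triple in ground_triples(candidates, ontology):
        print(triple)
```

Rewriting matched predicates to Wikidata property IDs is what would make the output interoperable with Wikidata, while relations without a counterpart stay in the ontology under their own labels.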
