Construction of Knowledge Graphs: State and Challenges

Marvin Hofer; Daniel Obraczka; Alieh Saeedi; Hanna Köpcke; Erhard Rahm

TL;DR

This paper surveys the state of knowledge graph construction with a focus on incremental and continuous updates. It defines four key KG construction requirements—input handling, incremental processing, tooling, and quality assurance—and details the tasks needed to build high-quality KGs, including metadata management and ontology management. The authors evaluate 23 KG-specific construction approaches and 7 toolsets against these requirements, highlighting limited support for incremental updates and open-source tooling. They conclude with open challenges and a call for modular, open-source pipelines and robust benchmarks to enable scalable, high-quality KG construction across domains.

Abstract

With knowledge graphs (KGs) at the center of numerous applications such as recommender systems and question answering, the need for generalized pipelines to construct and continuously update such KGs is increasing. While the individual steps that are necessary to create KGs from unstructured (e.g., text) and structured data sources (e.g., databases) are mostly well-researched for their one-shot execution, their adoption for incremental KG updates and the interplay of the individual steps have hardly been investigated in a systematic manner so far. In this work, we first discuss the main graph models for KGs and introduce the major requirements for future KG construction pipelines. Next, we provide an overview of the necessary steps to build high-quality KGs, including cross-cutting topics such as metadata management, ontology development, and quality assurance. We then evaluate the state of the art of KG construction with respect to the introduced requirements for specific popular KGs as well as some recent tools and strategies for KG construction. Finally, we identify areas in need of further research and improvement.

Paper Structure (40 sections, 5 figures, 1 table)

Introduction
KG background and requirements for KG construction
Knowledge Graph
Graph Models
Requirements of KG construction
Construction Tasks
Data Acquisition & Preprocessing
Source Selection & Filtering
Data Acquisition
Transformation & Mapping
Data Cleaning
Metadata Management
Ontology Management
Ontology Learning
Ontology/schema matching
...and 25 more sections

Figures (5)

Figure 1: Simplified Knowledge Graph (KG) example demonstrating integrated information from five domains, showcasing ten entities of eight types connected by twelve relationships (two distinct is-a relations). Dashed lines indicate semantic structures (ontology or graph schema) such as entity types. Inferences can be made based on the relationships and typing, revealing additional information such as the broader birthplace of Aphex Twin being Ireland and Xtal belonging to the Techno genre. For clarity, not all possible inferences are shown.
Figure 2: Incremental Knowledge Graph Construction Pipeline.
Figure 3: Ontology and Entity Merging Strategies.
Figure 4: Knowledge Extraction steps for an example sentence linking entities and relations to the DBpedia KG.
Figure 5: Incremental entity resolution workflow.
