Assisted Knowledge Graph Authoring: Human-Supervised Knowledge Graph Construction from Natural Language

Marcel Gohsen, Benno Stein

TL;DR

Domain-specific knowledge graphs are scarce in large encyclopedic resources, hindering domain applications. The authors present WAKA, a web-based tool that lets domain experts author Wikidata-grounded knowledge graphs from natural language, combining a text editor with an interactive graph view. The construction pipeline integrates entity discovery (NER and Wikidata linking), relation extraction via mREBEL, knowledge fusion to RDF, and an NLI-based ranking to select plausible triples. Evaluation on the REDFM dataset shows high recall in early stages but modest precision and F1 overall, underscoring the need for human supervision to produce reliable domain-specific KG assets with practical impact.
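
The four pipeline stages named above can be pictured with a small, self-contained sketch. This is not WAKA's implementation: each function is a toy stand-in (dictionary lookup instead of NER and Wikidata linking, a surface pattern instead of mREBEL, a verbatim-match check instead of an NLI model), and every name and identifier here (`ENTITY_INDEX`, `Q_EX1`, `P_EX_BIRTHPLACE`, `build_graph`, and so on) is a hypothetical placeholder rather than a real Wikidata ID or a function from the tool. It only illustrates why an over-generating extraction stage benefits from a later plausibility filter.

```python
from dataclasses import dataclass

# Placeholder identifiers for illustration only; these are NOT real Wikidata IDs.
ENTITY_INDEX = {"Marie Curie": "Q_EX1", "Warsaw": "Q_EX2", "Paris": "Q_EX3"}
# Surface pattern -> placeholder property ID (toy stand-in for a relation extraction model).
RELATION_PATTERNS = {"was born in": "P_EX_BIRTHPLACE"}


@dataclass(frozen=True)
class Statement:
    text_form: str  # verbalized triple, used for plausibility scoring
    rdf: tuple      # (subject_id, property_id, object_id)


def discover_entities(text):
    """Stage 1 stand-in: find known mentions in the text and link them to IDs."""
    return {mention: eid for mention, eid in ENTITY_INDEX.items() if mention in text}


def extract_relations(text, entities):
    """Stage 2 stand-in: propose a candidate for every ordered entity pair.

    This deliberately over-generates (high recall, low precision).
    """
    candidates = []
    for pattern in RELATION_PATTERNS:
        if pattern not in text:
            continue
        for subj in entities:
            for obj in entities:
                if subj != obj:
                    candidates.append((subj, pattern, obj))
    return candidates


def fuse(candidates, entities):
    """Stage 3 stand-in: deduplicate candidates and map them to ID triples."""
    statements = []
    for subj, pattern, obj in dict.fromkeys(candidates):  # keeps first-seen order
        rdf = (entities[subj], RELATION_PATTERNS[pattern], entities[obj])
        statements.append(Statement(f"{subj} {pattern} {obj}", rdf))
    return statements


def score_plausibility(text, statement):
    """Stage 4 stand-in for an entailment score: 1.0 on a verbatim match, else 0.0."""
    return 1.0 if statement.text_form in text else 0.0


def build_graph(text, threshold=0.5):
    """Run all four toy stages and keep triples scoring at or above the threshold."""
    entities = discover_entities(text)
    candidates = extract_relations(text, entities)
    statements = fuse(candidates, entities)
    return [s.rdf for s in statements if score_plausibility(text, s) >= threshold]


if __name__ == "__main__":
    print(build_graph("Marie Curie was born in Warsaw."))
```

In this toy run the extraction stage proposes both orderings of the entity pair, and the scoring stage discards the implausible one. In a real setting the surviving triples would still be shown to a domain expert for confirmation, which is the human-supervision step the summary emphasizes.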

Abstract

Encyclopedic knowledge graphs, such as Wikidata, host an extensive repository of millions of knowledge statements. However, domain-specific knowledge from fields such as history, physics, or medicine is significantly underrepresented in those graphs. Although a few domain-specific knowledge graphs exist (e.g., PubMed for medicine), developing specialized retrieval applications for many domains still requires constructing knowledge graphs from scratch. To facilitate knowledge graph construction, we introduce WAKA: a Web application that allows domain experts to create knowledge graphs through the medium with which they are most familiar: natural language.

Paper Structure (12 sections, 2 equations, 2 figures, 1 table)

Figures (2)

  • Figure 1: Visualization of a knowledge graph in the WAKA frontend in which the entity university is highlighted.
  • Figure 2: Architecture of the automatic knowledge graph construction approach.