Assisted Knowledge Graph Authoring: Human-Supervised Knowledge Graph Construction from Natural Language
Marcel Gohsen, Benno Stein
TL;DR
Domain-specific knowledge is underrepresented in large encyclopedic knowledge graphs such as Wikidata, which hinders domain applications. The authors present WAKA, a web-based tool that lets domain experts author Wikidata-grounded knowledge graphs from natural language, combining a text editor with an interactive graph view. The construction pipeline integrates entity discovery (NER and Wikidata linking), relation extraction via mREBEL, knowledge fusion to RDF, and an NLI-based ranking to select plausible triples. Evaluation on the REDFM dataset shows high recall in early stages but modest precision and F1 overall, underscoring the need for human supervision to produce reliable domain-specific knowledge graphs.
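The four pipeline stages named above can be illustrated with a minimal Python sketch. This is not WAKA's implementation: every function name, the toy lexicon and pattern table, the `http://example.org/` namespace, and the scoring heuristic are assumptions made for illustration. Simple string matching stands in for NER, Wikidata linking, and mREBEL, and a mention-overlap score stands in for the NLI model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Triple:
    subject: str    # surface form of the subject mention
    predicate: str  # hypothetical property name
    obj: str        # surface form of the object mention


def discover_entities(text: str, lexicon: dict[str, str]) -> dict[str, str]:
    # Stand-in for NER plus entity linking: keep the lexicon entries
    # whose surface form occurs in the text.
    return {surface: ident for surface, ident in lexicon.items() if surface in text}


def extract_relations(text: str, entities: dict[str, str],
                      patterns: dict[str, str]) -> list[Triple]:
    # Stand-in for a neural relation extractor: look for the literal
    # string "<subject> <phrase> <object>" in the text.
    triples = []
    for subj in entities:
        for obj in entities:
            if subj == obj:
                continue
            for phrase, predicate in patterns.items():
                if f"{subj} {phrase} {obj}" in text:
                    triples.append(Triple(subj, predicate, obj))
    return triples


def fuse_to_rdf(triples: list[Triple], entities: dict[str, str],
                base: str = "http://example.org/") -> list[tuple[Triple, str]]:
    # Resolve mentions to identifiers and pair each candidate with an
    # N-Triples style line. The namespace is a placeholder, not Wikidata's.
    return [
        (t, f"<{base}{entities[t.subject]}> <{base}{t.predicate}> <{base}{entities[t.obj]}> .")
        for t in triples
    ]


def rank_statements(text: str, statements: list[tuple[Triple, str]],
                    score: Callable[[str, Triple], float],
                    threshold: float = 0.5) -> list[str]:
    # Score each candidate against the source text, drop those below the
    # threshold, and return the surviving RDF lines, best first.
    scored = [(score(text, t), line) for t, line in statements]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return [line for _, line in sorted(kept, key=lambda pair: pair[0], reverse=True)]


def toy_score(text: str, triple: Triple) -> float:
    # Placeholder for an NLI entailment score: the fraction of the
    # triple's two mentions that appear in the text.
    mentions = [triple.subject, triple.obj]
    return sum(m in text for m in mentions) / len(mentions)


text = "Marie Curie was born in Warsaw."
lexicon = {"Marie Curie": "MarieCurie", "Warsaw": "Warsaw", "Paris": "Paris"}
patterns = {"was born in": "placeOfBirth"}

entities = discover_entities(text, lexicon)
candidates = extract_relations(text, entities, patterns)
rdf = rank_statements(text, fuse_to_rdf(candidates, entities), toy_score)
print("\n".join(rdf))
```

In a supervised setting, the ranked list would be a set of suggestions for the domain expert to accept or reject rather than a final graph, which matches the summary's point that modest precision makes human review necessary.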
Abstract
Encyclopedic knowledge graphs, such as Wikidata, host an extensive repository of millions of knowledge statements. However, domain-specific knowledge from fields such as history, physics, or medicine is significantly underrepresented in those graphs. Although a few domain-specific knowledge graphs exist (e.g., PubMed for medicine), developing specialized retrieval applications for many domains still requires constructing knowledge graphs from scratch. To facilitate knowledge graph construction, we introduce WAKA: a Web application that allows domain experts to create knowledge graphs through the medium with which they are most familiar: natural language.
