CleanGraph: Human-in-the-loop Knowledge Graph Refinement and Completion
Tyler Bikaun, Michael Stewart, Wei Liu
TL;DR
CleanGraph is an interactive, web-based platform for refining and completing knowledge graphs (KGs) with a human-in-the-loop approach. It combines intuitive visualization, full Create, Read, Update, and Delete (CRUD) operations, and a plugin-based architecture that lets arbitrary Knowledge Graph Refinement (KGR) and Knowledge Graph Completion (KGC) models be integrated and used within the UI. Core contributions include subgraph-aware editing (including 1-hop deletions and node merges), a force-directed, frequency-weighted graph visualization, and a flexible plugin interface for error detection models (EDMs) and completion models (CMs) that supports domain-specific quality assurance. The tool lets domain experts iteratively verify and improve KG quality, which makes the graph more reliable for downstream tasks such as information retrieval and question answering, while remaining an open, extensible framework for future model integration and RDF-compatible expansion.
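The subgraph-aware edits mentioned above can be illustrated with a minimal sketch. It assumes the graph is stored as a set of (head, relation, tail) triples, reads "1-hop deletion" as removing a node together with every triple that touches it, and reads a merge as redirecting all of one node's triples to another node. The `TripleGraph` class and its method names are hypothetical and are not CleanGraph's actual API.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


@dataclass
class TripleGraph:
    """Hypothetical in-memory graph used only for illustration."""

    triples: Set[Triple] = field(default_factory=set)

    def delete_node(self, node: str) -> Set[Triple]:
        """Remove a node and every triple in which it is head or tail.

        Returns the removed triples so a reviewer could inspect or undo them.
        """
        removed = {t for t in self.triples if node in (t[0], t[2])}
        self.triples -= removed
        return removed

    def merge_nodes(self, source: str, target: str) -> None:
        """Redirect all triples that mention `source` to `target`.

        Assumption: a triple that only becomes a self-loop because of the
        merge is dropped; using a set removes duplicate triples.
        """
        merged: Set[Triple] = set()
        for h, r, t in self.triples:
            h2 = target if h == source else h
            t2 = target if t == source else t
            if h2 == t2 and h != t:
                continue  # self-loop created by the merge itself
            merged.add((h2, r, t2))
        self.triples = merged
```

Returning the removed triples from `delete_node` is a design choice made here so that a human reviewer can see exactly which facts a deletion affected.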
Abstract
This paper presents CleanGraph, an interactive web-based tool designed to facilitate the refinement and completion of knowledge graphs. Knowledge graphs are only as reliable as the facts they contain, and maintaining that reliability is crucial for real-world applications such as question-answering and information retrieval systems. These graphs are often assembled automatically from textual sources by extracting semantic triples via information extraction. However, assuring the quality of the extracted triples is challenging, especially for large or low-quality datasets, and errors that slip through can degrade the performance of downstream applications. CleanGraph allows users to perform Create, Read, Update, and Delete (CRUD) operations on their graphs, as well as to apply models, in the form of plugins, for graph refinement and completion tasks. These functionalities enable users to enhance the integrity and reliability of their graph data. A demonstration of CleanGraph and its source code can be accessed at https://github.com/nlp-tlp/CleanGraph under the MIT License.
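To make the plugin idea concrete, the following is a minimal sketch of how refinement and completion models could be registered and run over a graph. It is an illustration under stated assumptions, not CleanGraph's actual plugin interface: the `PluginRegistry` class, the function-based plugin signature, and the two toy plugins are all hypothetical. Plugin output is treated as suggestions for a human to review, and the graph itself is not modified.

```python
from typing import Callable, Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

# Assumed plugin shape: a function from the current triples to a set of triples.
# A detector returns triples it flags as likely errors;
# a completer returns triples it proposes adding.
Plugin = Callable[[Set[Triple]], Set[Triple]]


class PluginRegistry:
    """Hypothetical registry; all names are illustrative only."""

    def __init__(self) -> None:
        self.detectors: Dict[str, Plugin] = {}
        self.completers: Dict[str, Plugin] = {}

    def register_detector(self, name: str, fn: Plugin) -> None:
        self.detectors[name] = fn

    def register_completer(self, name: str, fn: Plugin) -> None:
        self.completers[name] = fn

    def run(self, triples: Set[Triple]) -> Dict[str, Dict[str, Set[Triple]]]:
        """Run every plugin and collect suggestions without editing the graph."""
        return {
            "flagged": {n: fn(triples) for n, fn in self.detectors.items()},
            "suggested": {n: fn(triples) for n, fn in self.completers.items()},
        }


def flag_self_loops(triples: Set[Triple]) -> Set[Triple]:
    """Toy detector: flag triples whose head and tail are the same node."""
    return {(h, r, t) for h, r, t in triples if h == t}


def complete_symmetric(triples: Set[Triple]) -> Set[Triple]:
    """Toy completer: propose the missing inverse of each 'connected_to' triple."""
    inverses = {(t, r, h) for h, r, t in triples if r == "connected_to"}
    return inverses - triples
```

Keeping plugins as plain functions over triples is one simple way to let arbitrary models sit behind a common interface; a real model would wrap its own inference call inside such a function.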
