Knowledge Graph Extension by Entity Type Recognition
Daqian Shi
TL;DR
This work addresses automatic knowledge graph extension across heterogeneous knowledge graphs with a framework driven by entity type recognition. The framework combines data preparation, entity type recognition at both the schema and instance levels, a knowledge graph extension algorithm, and dedicated assessment via Focus metrics, all underpinned by property-based similarity signals formalized through formal concept analysis (FCA) lattices. The key technical contributions are three property-based metrics (horizontal, vertical, and informational similarity) and a machine-learning-based recognizer that uses these signals alongside traditional lexical and semantic features, validated on ConfTrack, EnType, and related datasets. The LiveSchema platform operationalizes the framework, providing data-layer management, KG embedding, path recommendations, and end-to-end extension case studies that demonstrate practical applicability and scalability. Together, these results give knowledge engineers a standardized framework, dedicated evaluation metrics, and a working platform for constructing richer, more coherent knowledge graphs across domains.
Abstract
Knowledge graphs have emerged as an advancement and refinement of semantic networks, and their deployment is one of the critical methodologies in contemporary artificial intelligence. Constructing a knowledge graph is a multifaceted process involving various techniques; because building from scratch entails significant labor and time costs, researchers aim to extract knowledge from existing resources instead. However, because heterogeneity is pervasive, the diversity of descriptions across knowledge graphs can lead to mismatches between concepts, which reduces the efficacy of knowledge extraction. This Ph.D. study focuses on automatic knowledge graph extension, i.e., extending a reference knowledge graph by extracting and integrating concepts from one or more candidate knowledge graphs. We propose a novel knowledge graph extension framework based on entity type recognition. The framework aims to achieve high-quality knowledge extraction by aligning schemas and entities across different knowledge graphs, thereby improving the performance of the extension. This paper makes three major contributions: (i) we propose an entity type recognition method that exploits machine learning and property-based similarities to enhance knowledge extraction; (ii) we introduce a set of assessment metrics to validate the quality of the extended knowledge graphs; (iii) we develop a platform for knowledge graph acquisition, management, and extension to support knowledge engineers in practice. Our evaluation demonstrates the feasibility and effectiveness of the proposed extension framework and its functionalities through quantitative experiments and case studies.
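To make the idea of property-based similarity concrete, the sketch below matches a candidate entity type to a reference entity type by the overlap of their property sets. It is only an illustration under stated assumptions: plain Jaccard overlap stands in for the horizontal, vertical, and informational similarity metrics, which are not defined in this abstract, and the function names, type names, and property names are all hypothetical.

```python
from typing import Dict, Set, Tuple


def property_overlap(props_a: Set[str], props_b: Set[str]) -> float:
    """Jaccard overlap of two property sets (0.0 when both are empty).

    Assumption: a generic stand-in, not one of the paper's three metrics.
    """
    union = props_a | props_b
    return len(props_a & props_b) / len(union) if union else 0.0


def best_matching_type(
    candidate_props: Set[str], reference_types: Dict[str, Set[str]]
) -> Tuple[str, float]:
    """Return (type_name, score) for the reference type with the highest overlap."""
    scored = (
        (name, property_overlap(candidate_props, props))
        for name, props in reference_types.items()
    )
    return max(scored, key=lambda pair: pair[1])


# Hypothetical reference schema: entity types described by their properties.
reference_types = {
    "Person": {"name", "birthDate", "affiliation"},
    "Event": {"name", "startDate", "location"},
}

# Hypothetical entity type from a candidate knowledge graph, labeled differently.
candidate_props = {"name", "startDate", "endDate", "location"}

match = best_matching_type(candidate_props, reference_types)
print(match)
```

In this toy input the candidate shares three of four distinct properties with "Event" and only one of six with "Person", so the property signal alone aligns it with "Event" regardless of how the candidate type is labeled. A recognizer of the kind described above would combine such signals with lexical and semantic features rather than rely on overlap alone.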
