A Semantic Partitioning Method for Large-Scale Training of Knowledge Graph Embeddings
Yuhe Bai
TL;DR
This work addresses the limitations of traditional knowledge graph embeddings, which often ignore ontology information and struggle with large-scale training. It proposes ontology-based semantic partitioning that groups fact triplets by entity class, enabling parallel training while enriching embeddings with semantic cues. The approach preserves existing scoring functions and demonstrates model-dependent improvements on standard benchmarks (e.g., FB15K, FB15K-237), with DistMult showing notable gains in some settings. The method offers a scalable, ontology-informed framework suitable for improving downstream tasks like link prediction and entity typing in large KG environments.
Abstract
Knowledge graph embeddings have been highly successful in recent years, and many methods now report state-of-the-art results on a range of tasks. However, most current methods suffer from one or more of the following problems: (i) they consider only fact triplets and ignore the ontology information of knowledge graphs; (ii) the resulting embeddings carry little semantic information, which makes them problematic to use for semantic tasks; (iii) they do not support large-scale training. In this paper, we propose a new algorithm that incorporates the ontology of the knowledge graph and partitions the graph by class, so that the embeddings include more semantic information and large-scale knowledge graph embeddings can be trained in parallel. Our preliminary results show that our algorithm performs well on several popular benchmarks.
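To make the class-based partitioning idea concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes each entity has exactly one ontology class and that a triplet is assigned to the partition of its head entity's class; the abstract does not specify either rule. The function name `partition_by_class`, the `entity_class` mapping, and the toy triplets are all hypothetical.

```python
from collections import defaultdict


def partition_by_class(triples, entity_class, default="UNKNOWN"):
    """Group (head, relation, tail) triplets by the class of the head entity.

    Assumption: one class per entity, and the head entity decides the
    partition. Entities without a known class fall into `default`.
    """
    partitions = defaultdict(list)
    for head, relation, tail in triples:
        cls = entity_class.get(head, default)
        partitions[cls].append((head, relation, tail))
    return dict(partitions)


# Toy data, invented purely for illustration.
triples = [
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
    ("France", "borders", "Germany"),
    ("Marie_Curie", "born_in", "Warsaw"),
]
entity_class = {
    "Paris": "City",
    "Berlin": "City",
    "France": "Country",
    "Germany": "Country",
    "Marie_Curie": "Person",
}

partitions = partition_by_class(triples, entity_class)
for cls, group in sorted(partitions.items()):
    print(cls, len(group))
```

Under these assumptions, each partition could be handed to a separate worker that trains with an unchanged scoring function. How embeddings of entities that appear in several partitions are reconciled is left open here.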
