A Survey on Knowledge Graph Structure and Knowledge Graph Embeddings
Jeffrey Sardina, John D. Kelleher, Declan O'Sullivan
TL;DR
This survey addresses how Knowledge Graph structure and KGEM hyperparameters jointly influence link prediction (LP) performance, highlighting biases driven by degree distribution and topological imbalance. It consolidates findings from numerous studies across structure metrics, hyperparameter choices, and benchmark KGEM evaluations to reveal consistent effects of node degree, centrality, and relation frequencies on learning and evaluation. A key takeaway is that hyperparameter preference is context-dependent, varying with both the KG and KGEM, and that many reported gains may be due to structure-induced biases rather than intrinsic model improvements. The authors advocate for structure-aware evaluation, ontological considerations, and structurally controlled benchmarks to advance holistic understanding and fair comparisons in KGEM research.
Abstract
Knowledge Graphs (KGs) and their machine learning counterpart, Knowledge Graph Embedding Models (KGEMs), have seen ever-increasing use in a wide variety of academic and applied settings. In particular, KGEMs are typically applied to KGs to solve the link prediction task; i.e., to predict new facts in the domain of a KG based on existing, observed facts. While this approach has shown substantial power in many end-use cases, it remains incompletely characterised in terms of how KGEMs react differently to KG structure. This is of particular concern in light of recent studies showing that KG structure can be a significant source of bias as well as a partial determinant of overall KGEM performance. This paper seeks to address this gap in the state of the art. It provides, to the authors' knowledge, the first comprehensive survey exploring established relationships between Knowledge Graph Embedding Models and KG structure in the literature. It is the hope of the authors that this work will inspire further studies in this area and contribute to a more holistic understanding of KGs, KGEMs, and the link prediction task.
