Comparison of Metadata Representation Models for Knowledge Graph Embeddings

Shusaku Egami, Kyoumoto Matsushita, Takanori Ugai, Ken Fukuda

TL;DR

This work examines how three metadata representation models (MRMs), Reification (REF), Singleton Property (SGP), and RDF-star (RDR), affect link prediction (LP) in hyper-relational knowledge graphs (HRKGs), and why fair cross-MRM evaluation is difficult. It introduces an Evaluation Method Diagnosis to surface task semantics, dataset complexity, and the limitations of knowledge graph embedding (KGE) models, and then proposes a unified LP framework that combines RDF-star-based pre-training with MRM-aware LP models (TransE and TransU) to reflect each MRM's structure in latent space. Empirical results show that REF performs well on simple HRKGs (WD50K), SGP is less effective, and the differences among MRMs become minimal on complex HRKGs (KGRC-RDF); hyperparameters such as walk depth and QT-walks significantly affect the results. The findings inform the choice of knowledge representation for LP in HRKGs and provide a fair approach to benchmarking across MRMs.

Abstract

Hyper-relational Knowledge Graphs (HRKGs) extend traditional KGs beyond binary relations, enabling the representation of contextual, provenance, and temporal information in domains such as historical events, sensor data, video content, and narratives. HRKGs can be structured using several Metadata Representation Models (MRMs), including Reification (REF), Singleton Property (SGP), and RDF-star (RDR). However, the effects of different MRMs on KG Embedding (KGE) and Link Prediction (LP) models remain unclear. This study evaluates MRMs in the context of LP tasks, identifies the limitations of existing evaluation frameworks, and introduces a new task that ensures fair comparisons across MRMs. Furthermore, we propose a framework that effectively reflects the knowledge representations of the three MRMs in latent space. Experiments on two types of datasets reveal that REF performs well in simple HRKGs, whereas SGP is less effective. In complex HRKGs, however, the differences among MRMs in the LP tasks are minimal. Our findings contribute to an optimal knowledge representation strategy for HRKGs in LP tasks.
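
To make the three MRMs concrete, the following minimal sketch encodes one qualified statement under each model using plain Python tuples. It is an illustration only, not the paper's implementation: the function names, the example fact, the `ex:` identifiers, and the `singletonPropertyOf` label are assumptions, and the exact vocabulary used in the paper's datasets may differ.

```python
# Illustrative sketch: one hyper-relational fact expressed under three MRMs.
# All identifiers (ex:..., function names) are hypothetical, chosen for this example.

def to_reification(s, p, o, qualifiers, stmt_id):
    """REF: a statement node points to subject, predicate, and object;
    qualifiers are attached to the statement node."""
    triples = [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", s),
        (stmt_id, "rdf:predicate", p),
        (stmt_id, "rdf:object", o),
    ]
    triples += [(stmt_id, qp, qo) for qp, qo in qualifiers]
    return triples

def to_singleton_property(s, p, o, qualifiers, prop_id):
    """SGP: a property instance unique to this statement links s and o;
    qualifiers are attached to that unique property."""
    triples = [
        (s, prop_id, o),
        (prop_id, "singletonPropertyOf", p),  # assumed label for the linking property
    ]
    triples += [(prop_id, qp, qo) for qp, qo in qualifiers]
    return triples

def to_rdf_star(s, p, o, qualifiers):
    """RDF-star: the triple itself (a nested tuple here) is the subject
    of each qualifier; the base triple is also asserted."""
    quoted = (s, p, o)
    return [quoted] + [(quoted, qp, qo) for qp, qo in qualifiers]

# Hypothetical example fact with a single qualifier.
fact = ("ex:Einstein", "ex:educatedAt", "ex:ETH_Zurich")
quals = [("ex:academicDegree", "ex:BSc")]

ref = to_reification(*fact, quals, stmt_id="ex:stmt1")
sgp = to_singleton_property(*fact, quals, prop_id="ex:educatedAt#1")
rdr = to_rdf_star(*fact, quals)

for name, triples in [("REF", ref), ("SGP", sgp), ("RDF-star", rdr)]:
    print(name, len(triples))
```

Under these assumptions, the same fact yields graphs of different size and shape (an extra statement node in REF, a per-statement property in SGP, a nested triple in RDF-star), which is the kind of structural difference an embedding model sees when the MRM changes.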

Paper Structure

This paper contains 33 sections, 3 equations, 10 figures, 4 tables, and 1 algorithm.

Figures (10)

  • Figure 1: Example of metadata representation models
  • Figure 2: Overview of experimental approach
  • Figure 3: Hyperparameter importance
  • Figure 4: Optimization process of walking strategy parameters
  • Figure 5: Relationship between graph walk depth and LP accuracy (REF)
  • ...and 5 more figures