Exploring Weighted Property Approaches for RDF Graph Similarity Measure
Ngoc Luyen Le, Marie-Hélène Abel, Philippe Gouspillou
TL;DR
This work addresses the problem of RDF graph similarity by introducing weighted properties that reflect the context-dependent importance of different predicates. It proposes a hybrid similarity framework that combines quantitative object comparisons via a distance-based term $Sim_1$ with qualitative text-based similarity via $Sim_2$, yielding an overall score $Sim(G_1,G_2)$ normalized by the sum of weights. The approach is instantiated in the vehicle domain, where weighted variants $P1$–$P11$ are evaluated against the baselines PJ, PS, and P0; results indicate that the weighted-property variants, particularly $P11$, achieve higher similarity scores and better reflect perceived graph similarity. The study highlights gains in accuracy and context-sensitivity, acknowledges challenges in weight calibration and scalability, and points to future work on systematic weight assignment and broader domain applications. The work contributes a pragmatic mechanism for tailoring RDF graph similarity to application needs, enabling more precise knowledge discovery and semantic search in domains with heterogeneous property importance.
Abstract
Measuring similarity between RDF graphs is essential for various applications, including knowledge discovery, semantic web analysis, and recommender systems. However, traditional similarity measures often treat all properties equally, potentially overlooking the fact that the importance of a property varies with context. To address this limitation, we propose a weighted property approach to RDF graph similarity measurement. Our approach incorporates the relative importance of properties into the similarity calculation, enabling a more nuanced and context-aware measure of similarity. We evaluate the approach through a comprehensive experimental study on an RDF graph dataset in the vehicle domain. The results demonstrate that the proposed approach achieves promising accuracy and effectively reflects the perceived similarity between RDF graphs.
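To make the idea of property weighting concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes each graph is flattened to a dictionary of property values, that numeric values are compared with a range-normalized distance, that text values are compared with token-level Jaccard similarity, and that a property missing from either graph contributes zero. The function names, the example properties, the weights, and the value ranges are all hypothetical.

```python
from typing import Dict, Union

Value = Union[int, float, str]


def numeric_similarity(a: float, b: float, value_range: float) -> float:
    """Distance-based similarity in [0, 1]; value_range is an assumed normalizer."""
    if value_range <= 0:
        return 1.0 if a == b else 0.0
    return max(0.0, 1.0 - abs(a - b) / value_range)


def text_similarity(a: str, b: str) -> float:
    """Token-level Jaccard similarity, used here as a stand-in text measure."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def weighted_similarity(
    g1: Dict[str, Value],
    g2: Dict[str, Value],
    weights: Dict[str, float],
    ranges: Dict[str, float],
) -> float:
    """Weighted average of per-property similarities, normalized by total weight."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    score = 0.0
    for prop, weight in weights.items():
        if prop not in g1 or prop not in g2:
            continue  # assumption: a missing property contributes zero similarity
        v1, v2 = g1[prop], g2[prop]
        if isinstance(v1, (int, float)) and isinstance(v2, (int, float)):
            sim = numeric_similarity(v1, v2, ranges.get(prop, 0.0))
        else:
            sim = text_similarity(str(v1), str(v2))
        score += weight * sim
    return score / total_weight


# Hypothetical vehicle descriptions and weights, for illustration only.
vehicle_a = {"brand": "Renault", "model": "Clio IV", "mileage": 50000, "year": 2018}
vehicle_b = {"brand": "Renault", "model": "Clio V", "mileage": 70000, "year": 2020}
weights = {"brand": 3.0, "model": 2.0, "mileage": 1.0, "year": 1.0}
ranges = {"mileage": 200000.0, "year": 20.0}

similarity = weighted_similarity(vehicle_a, vehicle_b, weights, ranges)
```

Setting every weight to the same value recovers the equal-treatment behavior that the abstract attributes to traditional measures, so under these assumptions the weights are the only place where context enters the score.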
