Condensed Representation of RDF and its Application on Graph Versioning
Jey Puget Gil, Emmanuel Coquery, John Samuel, Gilles Gesquiere
TL;DR
This work formalizes a condensed representation for evolving RDF graphs and develops QuaQue, a system within the ConVer-G framework, to query across multiple graph versions. It introduces flat and condensed dataset models, an extended SPARQL algebra, and metadata operators to support version-aware querying, along with proofs of equivalence between representations. Benchmark results show that condensed representations save space and can offer favorable performance for certain aggregative workloads, while flat representations excel for non-aggregative queries. The approach aims to enable multi-view, provenance-aware analysis of evolving knowledge graphs and integrates with existing RDF infrastructure through a translation layer. Overall, it provides a principled foundation for efficient, version-aware RDF data management with practical implications for domains like urban data, biomedical informatics, and software evolution.
Abstract
Evolving phenomena, often complex, can be represented using knowledge graphs, which can model heterogeneous data from multiple sources. A considerable number of sources that deliver periodic updates to knowledge graphs in various domains are now openly available. The evolution of data is of interest to knowledge graph management systems, so it is crucial to organize these constantly evolving data to make them easily accessible and exploitable for analysis. In this article, we present and formalize the condensed representation of these evolving graphs and propose a new solution, QuaQue, that allows querying across multiple versions of graphs. We also present the results of a benchmark comparing our solution against existing approaches.
