How to evaluate NoSQL Database Paradigms for Knowledge Graph Processing
Rosario Napoli, Antonio Celesti, Massimo Villari, Maria Fazio
TL;DR
The paper tackles the problem of selecting NoSQL DBMS paradigms for Knowledge Graph processing by introducing a KG-specific benchmarking framework that accounts for scale, connectivity, and semantic richness via $S(KG)=|V|+|E|$, $CD(KG)=|E|/|V|$, and $SR(KG)=D_{types}+H(C)+H(R)$. It conducts a reproducible evaluation on the FAERS KG across three scales using three paradigms (Neo4j, MongoDB, ArangoDB) and a four-tier query workload to identify crossover points and derive evidence-based guidelines for paradigm selection. The results reveal clear trade-offs: document stores excel at simple attribute filtering, multi-model systems balance mixed workloads, and graph-native engines dominate deep traversals in semantically rich, highly connected graphs, with crossover points tied to $SR(KG)$ and $CD(KG)$. The study advances KG infrastructure practice by translating ad-hoc choices into data-driven decisions and suggests avenues for automated, self-adapting storage frameworks guided by KG characteristics and query requirements.
Abstract
Knowledge Graph (KG) processing faces critical infrastructure challenges in selecting optimal NoSQL database paradigms, as traditional performance evaluations rely on static benchmarks that fail to capture the complexity of real-world KG workloads. Although the big data field offers numerous comparative studies, in the KG context DBMS selection remains predominantly ad-hoc, leaving practitioners without systematic guidance for matching storage technologies to specific KG characteristics and query requirements. This paper presents a KG-specific benchmarking framework that combines scale, connectivity density, and a newly introduced graph-centric metric, Semantic Richness (SR), with a four-tier query methodology to reveal performance crossover points across Document-Oriented, Graph, and Multi-Model DBMSs. We conduct an empirical evaluation on the FAERS adverse event KG at three scales, comparing the paradigms on workloads ranging from simple filtering to deep traversal, and provide metric-driven, evidence-based guidelines for aligning NoSQL paradigm selection with graph size, connectivity, and semantic richness.
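As a rough illustration of the three KG characteristics named above, the following Python sketch computes $S(KG)$, $CD(KG)$, and $SR(KG)$ from lists of node and edge labels. The summary does not define the terms of $SR(KG)$, so the sketch makes several assumptions: $D_{types}$ is taken to be the number of distinct node labels plus distinct edge labels, $H(C)$ and $H(R)$ are taken to be base-2 Shannon entropies of the node-class and relation-type distributions, and the names `shannon_entropy` and `kg_metrics` as well as the sample labels are hypothetical. The paper's exact definitions may differ.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Base-2 Shannon entropy of a label distribution (assumed meaning of H)."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def kg_metrics(node_labels, edge_labels):
    """Compute S, CD, and SR for a graph given one label per node and per edge.

    Assumptions (not confirmed by the source):
      - D_types = distinct node labels + distinct edge labels
      - H(C), H(R) = entropy of node-class and relation-type distributions
    """
    n_vertices = len(node_labels)
    n_edges = len(edge_labels)

    scale = n_vertices + n_edges                      # S(KG) = |V| + |E|
    density = n_edges / n_vertices if n_vertices else 0.0  # CD(KG) = |E| / |V|

    d_types = len(set(node_labels)) + len(set(edge_labels))
    richness = d_types + shannon_entropy(node_labels) + shannon_entropy(edge_labels)

    return {"S": scale, "CD": density, "SR": richness}


if __name__ == "__main__":
    # Hypothetical toy graph: labels are illustrative, not taken from the FAERS KG.
    nodes = ["Drug", "Drug", "Patient", "Patient"]
    edges = ["TOOK", "TOOK", "REPORTED", "REPORTED"]
    print(kg_metrics(nodes, edges))
```

Working from label lists keeps the sketch independent of any particular DBMS; in practice the same counts could be obtained from whatever store holds the graph.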
