NoSQL Graph Databases: an overview

Veronica Santos, Bruno Cuconato

TL;DR

This paper surveys the landscape of NoSQL graph databases, clarifying the two dominant graph models, RDF and labeled property graphs (LPG), and contrasting native versus non-native storage, query languages, and transactional guarantees. It reviews theoretical foundations, including graph representations, query patterns, and consistency/isolation trade-offs, and provides an in-depth comparison of two representative systems: AllegroGraph (RDF) and Neo4j (LPG). The analysis highlights how RDF-based triple stores and LPG-based property graphs approach data modeling, storage, query capabilities, and distribution, including ACID versus CAP considerations and the role of named graphs and index-free adjacency. The paper concludes that the graph database landscape remains heterogeneous, with no single standard, underscoring the need for better alignment between academic frameworks and industry practice, particularly in partitioning strategies, schema management, and query optimization for graph workloads.

Abstract

Graphs are the most suitable structures for modeling objects and interactions in applications where component inter-connectivity is a key feature. There has been increased interest in graphs to represent domains such as social networks, web site link structures, and biology. Graph stores recently rose to prominence alongside the NoSQL movement. In this work we focus on NoSQL graph databases, describing the peculiarities that set them apart from other data storage and management solutions, and how they differ among themselves. We also analyze in depth two graph database management systems, AllegroGraph and Neo4j, which use the graph models most popular among NoSQL stores in practice: the Resource Description Framework (RDF) and the labeled property graph (LPG), respectively.

Paper Structure

This paper contains 20 sections, 2 equations, 13 figures, 1 table.

Figures (13)

  • Figure 1: The Graph Database Space
  • Figure 2: Binary structure of node and relationship store files in Neo4j versions 2.* [robinson2015graph]
  • Figure 3: Graph representation in Neo4j [robinson2015graph]
  • Figure 4: Cypher query returning the classes a student attends
  • Figure 5: Typical Neo4j cluster architecture [neo4j-operations-manual]
  • ...and 8 more figures