Poseidon: A OneGraph Engine

Brad Bebee, Ümit V. Çatalyürek, Olaf Hartig, Ankesh Khandelwal, Simone Rondelli, Michael Schmidt, Lefteris Sidirourgos, Bryan Thompson

TL;DR

Poseidon is a high-performance HTAP graph engine for Neptune Analytics that unifies RDF and LPG data through the OneGraph (1G) model and supports openCypher with algorithm invocations. It combines a 13-relation in-memory storage layout, lock-free structures, and a durable logical log to achieve rapid loads and low-latency transactions while supporting complex analytics and vector search. The Poseidon Access Pattern Language (P8APL) abstracts physical storage, guiding a cost-based planner and a dataflow execution engine that run both transactional and analytical workloads on live graphs. The result is a scalable graph platform that handles dynamic graphs, extensive graph analytics, and cross-model interoperability with significant performance benefits over traditional approaches.

Abstract

We present the Poseidon engine behind the Neptune Analytics graph database service. Customers interact with Poseidon using the declarative openCypher query language, which enables requests that seamlessly combine traditional querying paradigms (such as graph pattern matching, variable-length paths, and aggregation) with algorithm invocations. The language has been syntactically extended to facilitate OneGraph interoperability, for example to disambiguate between globally unique IRIs (as exposed via RDF) and local identifiers (as encountered in LPG data). Poseidon supports a broad range of graph workloads, from simple transactions, to top-k beam search algorithms on dynamic graphs, to whole-graph analytics requiring multiple full passes over the data. For example, real-time fraud detection, like many other use cases, needs to reflect the current committed state of the dynamic graph. If a user's cell phone is compromised, then all newer actions by that user become immediately suspect. To address such dynamic graph use cases, Poseidon combines state-of-the-art transaction processing with novel graph data indexing, including lock-free maintenance of adjacency lists, secondary succinct indices, partitioned heaps for data tuple storage with uniform placement, and innovative statistics for cost-based query optimization. The Poseidon engine uses a logical log for durability, enabling rapid evolution of in-memory data structures. Bulk data loads achieve more than 10 million property values per second on many data sets, while simple transactions can execute in under 20 ms against the storage engine.

Paper Structure

This paper contains 22 sections, 5 figures, 6 tables.

Figures (5)

  • Figure 1: 1D and 2D partitioning of Poseidon relations.
  • Figure 2: Dictionary encoding to gui.
  • Figure 3: Comparison of index-based, scan-based and hybrid (default) BFS performance.
  • Figure 4: Closeness Centrality Performance on test datasets, using $n'=2,048$ source vertices.
  • Figure 5: Performance of the EgoNet variants.