Towards a Converged Relational-Graph Optimization Framework

Yunkai Lou, Longbin Lai, Bingqing Lyu, Yufan Yang, Xiaoli Zhou, Wenyuan Yu, Ying Zhang, Jingren Zhou

TL;DR

RelGo addresses the optimization gap created by SQL/PGQ's graph-pattern queries by introducing SPJM, a converged query skeleton that extends SPJ (Select-Project-Join) with a matching operator and thereby unifies relational and graph semantics. It combines graph-aware, decomposition-based optimization with relational planning, using a GRainDB-style graph index and Calcite-based relational optimization on top of a DuckDB runtime. The framework shows substantial end-to-end performance gains over graph-agnostic baselines and competing systems, with an average speedup of up to 21.9x on LDBC SNB and strong gains on cyclic patterns. The work is practically relevant to graph-embedded analytics inside relational DBMSs, where it enables efficient, scalable SQL/PGQ query processing.

Abstract

The recent ISO SQL:2023 standard adopts SQL/PGQ (Property Graph Queries), facilitating graph-like querying within relational databases. This advancement, however, underscores a significant gap in how to effectively optimize SQL/PGQ queries within relational database systems. To address this gap, we extend the foundational SPJ (Select-Project-Join) queries to SPJM queries, which include an additional matching operator for representing graph pattern matching in SQL/PGQ. Although SPJM queries can be converted to SPJ queries and optimized using existing relational query optimizers, our analysis shows that such a graph-agnostic method fails to benefit from graph-specific optimization techniques found in the literature. To address this issue, we develop a converged relational-graph optimization framework called RelGo for optimizing SPJM queries, leveraging joint efforts from both relational and graph query optimizations. Using DuckDB as the underlying relational execution engine, our experiments show that RelGo can generate efficient execution plans for SPJM queries. On well-established benchmarks, these plans exhibit an average speedup of 21.90x compared to those produced by the graph-agnostic optimizer.

Paper Structure (31 sections, 2 theorems, 10 equations, 12 figures, 1 table)

This paper contains 31 sections, 2 theorems, 10 equations, 12 figures, 1 table.

Key Result

Lemma 1

Under RGMapping, the matching operation in an SPJM query can be losslessly transformed into a sequence of relational joins involving $n$ vertex relations and $m$ edge relations.
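
The idea behind the lemma can be illustrated with a minimal sketch. It is not the paper's implementation: the toy schema (`Person`, `Message`, `Likes`), the key columns (`id`, `pid`, `mid`), and the helper `pattern_to_sql` are all hypothetical, and SQLite stands in for the relational engine. The sketch assumes each pattern vertex maps to one vertex relation and each pattern edge maps to one edge relation whose foreign keys point at its endpoints, so a pattern with 3 vertices and 2 edges becomes a join over 3 + 2 relation instances.

```python
import sqlite3

# Hypothetical toy schema: vertex relations Person and Message, edge relation Likes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Message (id INTEGER PRIMARY KEY, content TEXT);
CREATE TABLE Likes   (pid INTEGER REFERENCES Person(id),
                      mid INTEGER REFERENCES Message(id));
INSERT INTO Person  VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
INSERT INTO Message VALUES (10, 'hello'), (11, 'graphs');
INSERT INTO Likes   VALUES (1, 10), (2, 10), (3, 11);
""")

# Pattern: (p1:Person)-[l1:Likes]->(m:Message)<-[l2:Likes]-(p2:Person)
# Pattern vertex alias -> vertex relation.
vertices = {"p1": "Person", "p2": "Person", "m": "Message"}
# Pattern edge alias -> (edge relation, source alias, source FK, target alias, target FK).
edges = {
    "l1": ("Likes", "p1", "pid", "m", "mid"),
    "l2": ("Likes", "p2", "pid", "m", "mid"),
}


def pattern_to_sql(vertices, edges, select):
    """Rewrite a graph pattern as a plain join query (assumed mapping, see lead-in)."""
    # One relation instance per pattern vertex and per pattern edge.
    from_items = [f"{table} AS {alias}" for alias, table in vertices.items()]
    from_items += [f"{edge[0]} AS {alias}" for alias, edge in edges.items()]
    # Each edge contributes two equi-join conditions, one per endpoint.
    conds = []
    for alias, (_, src, src_fk, dst, dst_fk) in edges.items():
        conds.append(f"{alias}.{src_fk} = {src}.id")
        conds.append(f"{alias}.{dst_fk} = {dst}.id")
    return (
        f"SELECT {', '.join(select)} "
        f"FROM {', '.join(from_items)} "
        f"WHERE {' AND '.join(conds)}"
    )


sql = pattern_to_sql(vertices, edges, ["p1.name", "p2.name", "m.content"])
# No distinctness constraint is added, so p1 and p2 may bind to the same row.
rows = sorted(conn.execute(sql).fetchall())
print(sql)
for row in rows:
    print(row)
```

A query produced this way is what a graph-agnostic optimizer sees: an ordinary multi-way join with no indication that the joined relations form a graph pattern.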

Figures (12)

  • Figure 1: An example of an SQL/PGQ query.
  • Figure 2: An example of RGMapping.
  • Figure 3: Example of decomposition trees and the corresponding logical plans. Note that sub-pattern $\mathcal{P}_2$ can be a leaf node, but it cannot be an intermediate node.
  • Figure 4: Comparison of search space and optimization time.
  • Figure 5: The graph index constructed among relations $R_{\text{Person}}$, $R_{\text{Likes}}$ and $R_{\text{Message}}$ in Fig. 2(a).
  • ...and 7 more figures

Theorems & Definitions (9)

  • Example 1
  • Example 2
  • Definition 1: Matching Operator, $\mathcal{M}$
  • Example 3
  • Lemma 1
  • Example 4
  • Remark 1
  • Theorem 1
  • Example 5