Joining Entities Across Relation and Graph with a Unified Model

Wenzhi Fu

TL;DR

The paper addresses the challenge of performing unified analytics across graphs and relations inside an RDBMS. It introduces the Relational Genetic (RG) model, which encodes graphs with pointers to preserve their topology; SQL$_δ$, an SQL dialect with graph pattern queries; and an exploration operator that enables graph pattern matching within declarative queries. A lightweight system, WhiteDB, demonstrates that hybrid queries can be optimized in an enlarged plan space and executed entirely inside the RDBMS, achieving orders-of-magnitude speedups on pattern queries over relational baselines and results competitive with native graph engines on larger patterns. The work enables data enrichment and cross-model querying without moving data between engines, which is of practical value to practitioners who need dynamic, in-database graph analytics and hybrid data processing.

Abstract

This paper introduces the RG (Relational Genetic) model, a revised relational model that represents graph-structured data in an RDBMS while preserving its topology, for efficiently and effectively extracting data in different formats from disparate sources. It comes with: (a) SQL$_δ$, an SQL dialect augmented with graph pattern queries and tuple-vertex joins, so that one can extract graph properties via graph pattern matching and "semantically" match entities across relations and graphs; (b) a logical representation of graphs in an RDBMS, which introduces an exploration operator for efficient pattern querying and also supports browsing and updating graph-structured data; and (c) a strategy to uniformly evaluate SQL, pattern, and hybrid queries that join tuples and vertices, all inside an RDBMS by leveraging its optimizer, without the performance degradation of switching between execution engines. A lightweight system, WhiteDB, is developed as an implementation to evaluate the benefits the model brings on real-life data. We empirically verified that the RG model enables graph pattern queries to be answered as efficiently as in native graph engines; allows accesses to graphs and relations to be considered in any order when searching for an optimal plan; and supports effective data enrichment.
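
As a rough illustration of what a hybrid query that joins tuples and vertices might compute, the following Python sketch stores each vertex's neighbors as direct references (a stand-in for the pointer-based encoding) and follows them with an `explore` helper. This is not SQL$_δ$ syntax and not WhiteDB code: the names `Vertex`, `add_edge`, `explore`, and `hybrid_query`, the sample data, and the use of exact name equality in place of "semantic" entity matching are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """Hypothetical vertex record; `out` holds direct references to neighbors."""
    vid: int
    label: str
    name: str
    out: list = field(default_factory=list, repr=False)  # (edge_label, Vertex) pairs


def add_edge(src, edge_label, dst):
    """Record an edge as a reference stored on the source vertex."""
    src.out.append((edge_label, dst))


def explore(vertex, edge_label):
    """Follow stored references for one edge label (no lookup in an edge table)."""
    return [dst for lbl, dst in vertex.out if lbl == edge_label]


def hybrid_query(customers, vertices):
    """Join relation tuples with 'person' vertices on name (an assumed stand-in
    for semantic matching), then match the pattern person -[likes]-> product."""
    by_name = {}
    for v in vertices:
        if v.label == "person":
            by_name.setdefault(v.name, []).append(v)

    results = []
    for row in customers:
        for person in by_name.get(row["name"], []):
            for product in explore(person, "likes"):
                results.append((row["cid"], row["name"], product.name))
    return results


# Made-up sample data: a small graph and a small relation.
alice = Vertex(1, "person", "Alice")
bob = Vertex(2, "person", "Bob")
phone = Vertex(3, "product", "Phone")
laptop = Vertex(4, "product", "Laptop")
add_edge(alice, "likes", phone)
add_edge(alice, "likes", laptop)
add_edge(bob, "likes", phone)

customers = [{"cid": 10, "name": "Alice"}, {"cid": 11, "name": "Carol"}]
result = hybrid_query(customers, [alice, bob, phone, laptop])
print(result)
```

In this sketch the join happens first and the pattern is explored second; an optimizer of the kind the abstract describes could equally start from the graph side and join with the relation afterwards.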

Paper Structure (27 sections, 7 equations, 14 figures, 5 tables, 3 algorithms)

Figures (14)

  • Figure 1: An example hybrid query
  • Figure 2: Data representation and query evaluation overview
  • Figure 3: Relations and graphs in the RG model
  • Figure 4: Memory layout and physical design
  • Figure 5: A demonstration of the operators introduced in the RG model
  • ...and 9 more figures