Continuous Subgraph Matching via Cost-Model-based Dynamic Vertex Dominance Embeddings (Technical Report)

Yutong Ye, Xiang Lian, Nan Zhang, Mingsong Chen

TL;DR

Continuous subgraph matching (CSM) on dynamic graphs is challenging because of continuous updates and the NP-hardness of subgraph isomorphism. The paper presents DIVINE, a framework that converts CSM into a dominating-region search in a vertex-embedding space, using dynamic vertex dominance embeddings (SPUR/SPAN) and a cost-model-guided embedding design to prune candidates. It introduces degree grouping and DAS$^3$ synopses to strengthen pruning for high-degree vertices, and it supports incremental maintenance with space linear in the graph size and an embedding update cost of $O(d)$ per edge update. A cost model defines $Cost_{CSM}$ and motivates the optimized embedding $o'_{C}(v_i)=\alpha(x'_i||y'_i)+\beta z_i$, where $x'_i=f_Z(l(v_i))$ is Zipf-based, which yields strong pruning and low query cost. Experiments on real and synthetic graphs show that DIVINE is substantially more efficient than the baselines, with scalable offline DAS$^3$ construction and fast online CSM processing.
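
The arithmetic behind the optimized embedding $o'_{C}(v_i)=\alpha(x'_i||y'_i)+\beta z_i$ can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `optimized_embedding` is hypothetical, the vectors $x'_i$, $y'_i$, and $z_i$ are taken as given inputs (how $f_Z$ and $z_i$ are built is not covered here), and the sketch assumes $z_i$ has the same length as the concatenation $x'_i||y'_i$ so that the sum is well defined.

```python
def optimized_embedding(x_prime, y_prime, z, alpha, beta):
    """Compute alpha * (x' || y') + beta * z for one vertex.

    All names are illustrative. x_prime, y_prime, and z are plain
    sequences of numbers; alpha and beta are scalar weights.
    """
    # x'_i || y'_i: concatenate the two vectors
    base = list(x_prime) + list(y_prime)
    # Assumption: z must match the concatenated length to be added
    if len(z) != len(base):
        raise ValueError("z must have len(x_prime) + len(y_prime) entries")
    # Weighted element-wise combination
    return [alpha * b + beta * c for b, c in zip(base, z)]


embedding = optimized_embedding([1, 2], [3, 4], [1, 1, 1, 1], alpha=2, beta=3)
print(embedding)
```

With these toy inputs each coordinate is simply `2 * base + 3 * 1`, which makes it easy to see how $\alpha$ and $\beta$ trade off the label-driven part against $z_i$.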

Abstract

In many real-world applications, such as social network analysis, knowledge graph discovery, and biological network analytics, graph data management has become increasingly important and has drawn much attention from the database community. Since many graphs (e.g., Twitter, Wikipedia) evolve over time, it is important to study the \textit{continuous subgraph matching} (CSM) problem, a fundamental yet challenging graph operator that continuously monitors subgraph matching results over dynamic graphs with a stream of edge updates. To tackle the CSM problem efficiently, we design a general CSM processing framework based on a novel \textit{\underline{D}ynam\underline{I}c \underline{V}ertex Dom\underline{IN}ance \underline{E}mbedding} (DIVINE), which maps vertex neighborhoods into an embedding space to enable efficient subgraph matching and incremental maintenance under dynamic updates. Motivated by the low pruning power for high-degree vertices, we propose a new \textit{degree grouping} technique that decomposes high-degree star patterns into groups of lower-degree star substructures, and we devise \textit{degree-aware star substructure synopses} (DAS$^3$) over the embeddings of star substructure groups. We develop efficient algorithms to incrementally maintain dynamic graphs and answer CSM queries by traversing DAS$^3$ synopses and applying our \textit{vertex dominance} and \textit{range pruning strategies}. Extensive experiments confirm the efficiency of our proposed DIVINE approach on both real and synthetic graphs.

Paper Structure

This paper contains 30 sections, 4 theorems, 11 equations, 25 figures, 3 tables, and 4 algorithms.

Key Result

Lemma 1

(The Dominance Property of Vertex Embeddings) Given a unit star subgraph $g_{v_i}$ centered at vertex $v_i$ and any of its star substructures $s_{v_i}$ (i.e., $s_{v_i}\subseteq g_{v_i}$), their vertex embeddings satisfy the dominance condition that: $o(s_{v_i}) \preceq o(g_{v_i})$ (including $o(s_{v
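
The dominance condition $o(s_{v_i}) \preceq o(g_{v_i})$ can be illustrated with a small sketch. The embedding construction below is an assumption made only for illustration, not the paper's exact definition: each label is mapped to a fixed vector of positive integers (seeded by the label), and a star is embedded as the center's vector concatenated with the sum of its leaves' vectors. The names `label_vector`, `star_embedding`, `is_dominated`, and the dimension `DIM` are hypothetical. Under this construction, dropping leaves from a star can only lower the summed part, so every star substructure is dominated by the full unit star.

```python
import random

DIM = 4  # hypothetical embedding dimension, chosen only for illustration


def label_vector(label, dim=DIM):
    """Map a label to a fixed vector of positive integers (assumed construction)."""
    rng = random.Random(label)  # seeding by label makes the vector reproducible
    return [rng.randint(1, 9) for _ in range(dim)]


def star_embedding(center_label, leaf_labels, dim=DIM):
    """Embed a star as (center vector) || (sum of leaf vectors)."""
    center = label_vector(center_label, dim)
    leaves = [0] * dim
    for lab in leaf_labels:
        leaves = [a + b for a, b in zip(leaves, label_vector(lab, dim))]
    return center + leaves


def is_dominated(o_s, o_g):
    """Return True if o_s <= o_g in every dimension, i.e. o_s is dominated by o_g."""
    return len(o_s) == len(o_g) and all(a <= b for a, b in zip(o_s, o_g))


# Unit star g centered at a vertex labeled "A" with leaves B, C, D,
# and a star substructure s that keeps only leaves B and D.
g = star_embedding("A", ["B", "C", "D"])
s = star_embedding("A", ["B", "D"])
print(is_dominated(s, g))  # substructure vs. full star
print(is_dominated(g, s))  # reverse direction
```

In this sketch the first check succeeds and the reverse check fails, because the leaf labeled C contributes a strictly positive amount to every summed dimension of `g`. A candidate data vertex whose embedding does not dominate the query vertex's embedding can therefore be discarded without running an exact isomorphism test.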

Figures (25)

  • Figure 1: An example of the subgraph matching in dynamic collaboration networks $G_D$.
  • Figure 2: Illustration of unit star subgraph and substructures.
  • Figure 3: Illustration of vertex dominance embedding $o(v_1)$ and the optimized vertex dominance embedding $o'(v_1)$.
  • Figure 4: An example of vertex dominance embeddings in subgraph matching ($||$ is the concatenation of two vectors).
  • Figure 5: An example of vertex embeddings for CSM.
  • ...and 20 more figures

Theorems & Definitions (15)

  • Example 1
  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Lemma 1
  • Example 2
  • Example 3
  • Lemma 2
  • ...and 5 more