Incremental Sliding Window Connectivity over Streaming Graphs

Chao Zhang, Angela Bonifati, M. Tamer Özsu

TL;DR

This work tackles connectivity queries in sliding windows over streaming graphs, where deletions of expired edges make index maintenance costly. It introduces the Bidirectional Incremental model (BIC), which partitions streaming edges into chunks and maintains forward and backward buffers that are incrementally updated and then merged to answer $Q_c(s,t)$ without deleting edges from the index. The framework leverages incremental connectivity via Union-Find Trees in buffers, snapshot isolation with augmented trees, and a backward-forward bipartite graph (BFBG) to support efficient inter-buffer merging. Theoretical analysis shows near $O(\log n)$ query time and amortized updates, while experiments on eight real and two synthetic graphs demonstrate substantial throughput gains (up to 14x) and dramatic tail-latency reductions (up to 3900x) over state-of-the-art baselines, validating BIC's practicality for real-time graph analytics.

Abstract

We study index-based processing for connectivity queries within sliding windows on streaming graphs. These queries, which determine whether two vertices belong to the same connected component, are fundamental operations in real-time graph data processing and demand high throughput and low latency. While indexing methods that leverage data structures for fully dynamic connectivity can facilitate efficient query processing, they encounter significant challenges with deleting expired edges from the window during window updates. We introduce a novel indexing approach that eliminates the need for physically performing edge deletions. This is achieved through a unique bidirectional incremental computation framework, referred to as the BIC model. The BIC model implements two distinct incremental computations to compute connected components within the window, operating along and against the timeline, respectively. These computations are then merged to efficiently compute queries in the window. We propose techniques for optimized index storage, incremental index updates, and efficient query processing to improve BIC effectiveness. Empirically, BIC achieves up to a 14$\times$ increase in throughput and reduces P95 latency by up to 3900$\times$ compared with state-of-the-art indexes.
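The bidirectional idea can be illustrated with a deliberately naive sketch. It is not the paper's index: it assumes a count-based window whose size equals the chunk size, stores each backward snapshot as a full copy instead of using snapshot isolation, and merges the two buffers by brute force instead of using the BFBG. The names `NaiveBIC`, `UnionFind`, `insert`, and `connected` are hypothetical, and vertices are assumed to be non-tuple hashable values such as integers. What it does show is that expired edges are never deleted; a query only selects a shorter backward snapshot.

```python
class UnionFind:
    """Insert-only connectivity with union by size (no deletions supported)."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, v):
        self.parent.setdefault(v, v)
        self.size.setdefault(v, 1)
        while self.parent[v] != v:
            v = self.parent[v]
        return v

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        self.size[ru] += self.size[rv]

    def copy(self):
        c = UnionFind()
        c.parent = dict(self.parent)
        c.size = dict(self.size)
        return c


class NaiveBIC:
    """Window = the last `chunk_size` edges; it spans at most two chunks."""

    def __init__(self, chunk_size):
        self.w = chunk_size
        self.backward = []          # backward[i]: edges i..w-1 of the previous chunk
        self.current = []           # edges of the current chunk, in arrival order
        self.forward = UnionFind()  # forward buffer over the current chunk

    def insert(self, u, v):
        if len(self.current) == self.w:
            self._seal()
        self.current.append((u, v))
        self.forward.union(u, v)

    def _seal(self):
        # Scan the finished chunk against the timeline, snapshotting each suffix.
        uf = UnionFind()
        snaps = [None] * self.w
        for i in range(self.w - 1, -1, -1):
            uf.union(*self.current[i])
            snaps[i] = uf.copy()    # naive full copy, for illustration only
        self.backward = snaps
        self.current = []
        self.forward = UnionFind()

    def connected(self, s, t):
        k = len(self.current)       # window edges that lie in the current chunk
        parts = [self.forward]
        if self.backward and k < self.w:
            parts.append(self.backward[k])  # the w - k unexpired older edges
        # Brute-force merge: link each vertex to its component in each buffer.
        merged = UnionFind()
        for tag, uf in enumerate(parts):
            for v in list(uf.parent):
                merged.union(v, (tag, uf.find(v)))
        return merged.find(s) == merged.find(t)
```

With `chunk_size=3`, inserting (1,2), (2,3), (4,5) and then (5,6) leaves the window holding (2,3), (4,5), (5,6): vertices 4 and 6 are connected through an edge from each buffer, while 1 and 2 are not, because edge (1,2) has slid out of the window without any deletion being performed.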

Paper Structure

This paper contains 25 sections, 2 theorems, 2 equations, 12 figures, 1 table, 5 algorithms.

Key Result

lemma 1

The worst-case time complexity of performing find in an optimized UFT is $O(\log(|UFT|))$.
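One standard way to obtain a logarithmic worst-case find is union by size without path compression: a vertex's depth grows only when its tree is attached under a tree at least as large, so the component at least doubles each time. Whether this matches the paper's Definition 5 (Optimized UFT) is an assumption; the summary above does not give the definition. The sketch below, with hypothetical names `UnionBySize` and `max_find_steps`, builds an adversarial merge order and measures the longest find path.

```python
class UnionBySize:
    """Union-find tree that always attaches the smaller tree under the larger."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, v):
        """Return (root, number of parent pointers followed)."""
        steps = 0
        while self.parent[v] != v:
            v = self.parent[v]
            steps += 1
        return v, steps

    def union(self, u, v):
        ru, _ = self.find(u)
        rv, _ = self.find(v)
        if ru == rv:
            return
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru            # smaller (or equal) tree goes below
        self.size[ru] += self.size[rv]


def max_find_steps(n):
    """Merge equal-sized trees pairwise (the deepest case) and return the
    longest find path over all n elements."""
    uf = UnionBySize(n)
    step = 1
    while step < n:
        for i in range(0, n - step, 2 * step):
            uf.union(i, i + step)
        step *= 2
    return max(uf.find(v)[1] for v in range(n))
```

For a tree of $n$ elements built this way, the longest path stays within $\log_2 n$ parent hops, which is the kind of bound Lemma 1 states for $|UFT|$.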

Figures (12)

  • Figure 1: Running example.
  • Figure 2: Using BIC for the running example in Figure 1.
  • Figure 3: The forward buffer $f_2$ over chunk $c_2$ and the backward buffer $b_1$ over chunk $c_1$ in the running example in Figure 2.
  • Figure 4: Storing $b_1$ in Figure 3 using snapshot isolation.
  • Figure 5: The snapshots of BFBG for $b_1$ and $f_2$ in Figure 4.
  • ...and 7 more figures

Theorems & Definitions (9)

  • definition 1: Sliding Window Connectivity
  • definition 2: Chunks
  • definition 3: Backward and Forward Buffers
  • definition 4: The BIC Model
  • definition 5: Optimized UFT
  • lemma 1
  • lemma 2
  • definition 6
  • definition 7