Low-Latency Sliding Window Connectivity
Chao Zhang, Angela Bonifati, Tamer Özsu
TL;DR
This work tackles low-latency connectivity queries on streaming graphs under sliding windows by introducing an indexing framework based on maximum spanning trees (MSTs). Each window maintains one MST per connected component, which enables fast queries and efficient updates while completely eliminating the expensive replacement-edge searches that plague traditional fully dynamic connectivity (FDC) approaches. By integrating different FDC structures (D-Tree, Link-Cut Tree) into the MST framework, the authors achieve amortized $O(\log n)$ time in the best cases and report improvements in query latency (up to $1172\times$) and throughput (up to $80\times$), with substantially lower memory usage across real and synthetic datasets. The proposed OMST variants perform robustly across varying workloads, window sizes, and slide intervals, making the approach practical for real-time streaming graph analytics.
Abstract
Connectivity queries, which check whether vertices belong to the same connected component, are fundamental in graph computations. Sliding window connectivity processes these queries over sliding windows, facilitating real-time streaming graph analytics. However, existing methods struggle with low-latency processing due to the significant overhead of continuously updating index structures as edges are inserted and deleted. We introduce a novel approach that leverages spanning trees to efficiently process queries. The novelty of this method lies in its ability to maintain spanning trees efficiently as window updates occur. Notably, our approach completely eliminates the need for replacement edge searches, a traditional bottleneck in managing spanning trees during edge deletions. We also present several optimizations to maximize the potential of spanning-tree-based indexes. Our comprehensive experimental evaluation shows that index update latency in spanning trees can be reduced by up to $458\times$ while maintaining query performance, leading to an $8\times$ improvement in throughput. Our approach also significantly outperforms the state-of-the-art in both query processing and index updates. Additionally, our methods use significantly less memory and demonstrate consistent efficiency across various settings.
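To make the idea of deletion without replacement-edge search concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that each edge is weighted by its arrival timestamp and that timestamps only increase; the summary above does not state the weighting, so treat it as an assumption. The class name `SlidingWindowForest` and its methods are hypothetical, and a plain depth-first search stands in for the D-Tree or Link-Cut Tree, so the sketch shows the invariant rather than the $O(\log n)$ bounds.

```python
from collections import defaultdict


class SlidingWindowForest:
    """Toy maximum spanning forest keyed by edge arrival time (an assumption)."""

    def __init__(self):
        # Tree edges only: vertex -> {neighbor: timestamp}.
        self.tree = defaultdict(dict)

    def _path(self, src, dst):
        """Return the tree path from src to dst as a vertex list, or None."""
        parent = {src: None}
        stack = [src]
        while stack:
            u = stack.pop()
            if u == dst:
                path = []
                while u is not None:
                    path.append(u)
                    u = parent[u]
                return path[::-1]
            for v in self.tree.get(u, {}):
                if v not in parent:
                    parent[v] = u
                    stack.append(v)
        return None

    def insert(self, u, v, ts):
        """Insert edge (u, v) that arrived at time ts."""
        if u == v:
            return
        path = self._path(u, v)
        if path is not None:
            # The new edge closes a cycle: evict the oldest tree edge on it.
            a, b = min(zip(path, path[1:]), key=lambda e: self.tree[e[0]][e[1]])
            if self.tree[a][b] >= ts:
                return  # the new edge is not newer; keep the tree as it is
            del self.tree[a][b]
            del self.tree[b][a]
        self.tree[u][v] = ts
        self.tree[v][u] = ts

    def expire(self, cutoff):
        """Drop tree edges older than cutoff; no replacement search is needed."""
        for u in list(self.tree):
            for v, ts in list(self.tree[u].items()):
                if ts < cutoff:
                    del self.tree[u][v]
            if not self.tree[u]:
                del self.tree[u]

    def connected(self, u, v):
        return u == v or self._path(u, v) is not None


forest = SlidingWindowForest()
forest.insert(1, 2, ts=1)
forest.insert(2, 3, ts=2)
forest.insert(1, 3, ts=3)   # closes a cycle and evicts the oldest edge (1, 2)
forest.expire(cutoff=2)     # the window now holds only edges with ts >= 2
print(forest.connected(1, 2))
```

Under the timestamp assumption, any non-tree edge that could reconnect two vertices is at least as old as the tree edges it would replace, so it has already expired by the time a tree edge does. This is why `expire` can simply discard old tree edges.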
