BuffCut: Prioritized Buffered Streaming Graph Partitioning

Linus Baumgärtner; Adil Chhabra; Marcelo Fonseca Faraj; Christian Schulz

TL;DR

This work presents BuffCut, a buffered streaming partitioner that narrows the edge-cut quality gap between one-pass streaming and in-memory partitioners, particularly when stream ordering is adversarial, by combining prioritized buffering with batch-wise multilevel assignment.

Abstract

Streaming graph partitioners enable resource-efficient and massively scalable partitioning, but one-pass assignment heuristics are highly sensitive to stream order and often yield substantially higher edge cuts than in-memory methods. We present BuffCut, a buffered streaming partitioner that narrows this quality gap, particularly when stream ordering is adversarial, by combining prioritized buffering with batch-wise multilevel assignment. BuffCut maintains a bounded priority buffer to delay poorly informed decisions and regulate the order in which nodes are considered for assignment. It incrementally constructs high-locality batches of configurable size by iteratively inserting the highest-priority nodes from the buffer into the batch, effectively recovering locality structure from the stream. Each batch is then assigned via a multilevel partitioning algorithm. Experiments on diverse real-world and synthetic graphs show that BuffCut consistently outperforms state-of-the-art buffered streaming methods. Compared to the strongest prioritized buffering baseline, BuffCut achieves 20.8% fewer edge cuts while running 2.9 times faster and using 11.3 times less memory. Against the next-best buffered method, it reduces edge cut by 15.8% with only modest overheads of 1.8 times runtime and 1.09 times memory.
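The buffering and batching loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `buffered_stream_partition`, the placeholder score (share of already-assigned neighbors), the hub threshold `hub_degree`, and the greedy per-node assignment that stands in for multilevel partitioning are all assumptions made for illustration. Only the overall shape (bounded buffer, eviction of the highest-priority node into a batch of size `delta`, batch-wise commit) follows the text.

```python
from collections import Counter


def buffered_stream_partition(adj, order, k, q_max, delta, hub_degree):
    """Toy prioritized-buffer streaming partitioner (illustrative only).

    adj: dict node -> list of neighbors; order: stream order of nodes.
    """
    capacity = -(-len(adj) // k)   # ceil(n / k), hard block-size limit
    block = {}                     # committed assignments: node -> block id
    sizes = [0] * k
    buffered = {}                  # insertion-ordered set of buffered nodes
    batch = []

    def score(v):
        # ASSUMPTION: placeholder priority, the share of v's neighbors
        # that already have a block. Not the paper's scoring function.
        return sum(u in block for u in adj[v]) / max(1, len(adj[v]))

    def assign(v):
        # ASSUMPTION: greedy stand-in for multilevel partitioning. Choose
        # the non-full block with most assigned neighbors of v, breaking
        # ties toward the lighter block.
        votes = Counter(block[u] for u in adj[v] if u in block)
        open_blocks = [b for b in range(k) if sizes[b] < capacity]
        best = max(open_blocks, key=lambda b: (votes[b], -sizes[b]))
        block[v] = best
        sizes[best] += 1

    def evict():
        # Move the highest-scoring buffered node into the current batch.
        v = max(buffered, key=score)
        del buffered[v]
        batch.append(v)
        if len(batch) == delta:
            commit()

    def commit():
        for v in batch:
            assign(v)
        batch.clear()

    for v in order:
        if len(adj[v]) >= hub_degree:
            assign(v)              # hubs bypass the buffer
            continue
        buffered[v] = None
        if len(buffered) > q_max:  # buffer overflow triggers one eviction
            evict()

    while buffered:                # end of stream: drain the buffer
        evict()
    commit()                       # flush a final partial batch, if any
    return block
```

Calling it with an adjacency dictionary, a stream order, and small values such as `k=2, q_max=3, delta=2` returns a node-to-block mapping. The linear scan in `evict` keeps the sketch short; a real implementation would need a priority structure that supports score updates.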

Paper Structure (15 sections, 3 equations, 8 figures, 3 tables, 2 algorithms)

Introduction
Preliminaries
Basic Concepts
Related Work
BuffCut: Prioritized Buffered Streaming Partitioning
Overall Algorithm
Prioritized Buffered Streaming
Buffer Scoring Functions
Batch-Wise Multilevel Partitioning
Parallelization and Restreaming
Experimental Analysis
Prioritized Buffering and Batch-Wise Partitioning
Parallelization and Restreaming
Comparison to State-of-the-Art
Conclusion

Figures (8)

Figure 1: Edge cut on source and random ordering. The ratio of cut edges to total graph edges (%) on the uk-2007-05 web graph ($k=16$), comparing source ordering (the original node sequence in the source file) to random ordering (independent random permutations of node IDs, resulting in an adversarial stream order) for HeiStream, Cuttana, and our proposed BuffCut.
Figure 2: Overview of BuffCut and its parallel pipeline. Nodes arrive as a stream (left). Low-degree nodes are scored and inserted into a bounded prioritized buffer $\mathcal{Q}$, while hubs bypass the buffer. When $|\mathcal{Q}|=\mathcal{Q}_{\max}$, the highest-priority node is evicted to incrementally grow a batch $\mathcal{B}$ (middle); once $|\mathcal{B}|=\delta$, BuffCut constructs the batch model graph and partitions it via multilevel refinement, committing assignments (right). After committing, the batch $\mathcal{B}$ is cleared and construction resumes with the next evictions. The three stages are overlapped using an I/O reader, buffer handler, and partition worker.
Figure 3: Heatmap visualizations of CBS and HAA as a function of degree (x-axis) and assigned-neighbor ratio (y-axis). Bright colors indicate high scores, which lead to earlier eviction from the buffer.
Figure 4: Impact of buffer score on cut quality (Tuning Set, random order, $k=32$). Geometric means of edge cut relative to ANR (lower is better).
Figure 5: Effect of buffer size $\mathcal{Q}_{\max}$ (Tuning Set, random order, $k=32$). Larger buffers increase within-batch locality (IER) and reduce cut at increased memory cost.
...and 3 more figures
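The two quantities reported in the captions above, the percentage of cut edges (Figure 1) and the geometric mean of edge cut relative to a baseline (Figure 4), can be computed as in the following sketch. The function names and the input layout (an edge list plus a node-to-block dictionary, and two parallel lists of per-graph cuts) are assumptions chosen for illustration, not taken from the paper.

```python
import math


def edge_cut_percent(edges, block):
    """Percentage of edges whose endpoints lie in different blocks."""
    cut = sum(1 for u, v in edges if block[u] != block[v])
    return 100.0 * cut / len(edges)


def geomean_relative(cuts, baseline_cuts):
    """Geometric mean of per-graph cut ratios against a baseline."""
    logs = [math.log(c / b) for c, b in zip(cuts, baseline_cuts)]
    return math.exp(sum(logs) / len(logs))
```

A geometric mean below 1 indicates that, across the graphs, the method cuts fewer edges than the baseline on average, which matches the "lower is better" reading of Figure 4.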
