Buffered Streaming Edge Partitioning

Adil Chhabra; Marcelo Fonseca Faraj; Christian Schulz; Daniel Seemaier

TL;DR

This work tackles edge partitioning for massive graphs by introducing two buffered streaming algorithms, HeiStreamE and FreightE. HeiStreamE leverages a CSPAC-based batch model and a multilevel Fennel partitioner to achieve high-quality partitions with time and memory linear in the graph size and independent of the number of blocks $k$, while FreightE uses on-the-fly hypergraph partitioning to assign edges rapidly without CSPAC construction. The authors provide a detailed model construction, batch processing strategy, and three modes for connectivity-aware batch modeling, along with extensive parameter tuning and a comprehensive comparison against HDRF and 2PS variants. Empirical results show HeiStreamE generally outperforms competing streaming methods in replication factor and remains memory-efficient for real-world, edge-rich graphs, whereas FreightE delivers exceptionally fast partitioning, especially for large $k$, making the approaches practical for large-scale graph processing systems.

Abstract

Addressing the challenges of processing massive graphs, which are prevalent in diverse fields such as social, biological, and technical networks, we introduce HeiStreamE and FreightE, two innovative (buffered) streaming algorithms designed for efficient edge partitioning of large-scale graphs. HeiStreamE utilizes an adapted Split-and-Connect graph model and a Fennel-based multilevel partitioning scheme, while FreightE partitions a hypergraph representation of the input graph. Besides ensuring superior solution quality, these approaches also overcome the limitations of existing algorithms: their time and memory complexity depend linearly on the graph size and do not depend on the number of blocks in the partition. Our comprehensive experimental analysis demonstrates that HeiStreamE outperforms current streaming algorithms and the re-streaming algorithm 2PS in partitioning quality (replication factor), and is more memory-efficient for real-world networks where the number of edges is far greater than the number of vertices. Further, FreightE is shown to produce fast and efficient partitions, particularly for higher numbers of partition blocks.
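The quality measure named in the abstract, the replication factor, can be illustrated with a small sketch. It assumes the commonly used definition (the average number of distinct blocks in which a vertex appears, taken over vertices that have at least one edge); the function name and the toy 4-cycle are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def replication_factor(edges, edge_block):
    """Average number of distinct blocks each vertex appears in.

    edges      -- list of (u, v) pairs
    edge_block -- edge_block[i] is the block assigned to edges[i]
    Assumption: only vertices with at least one incident edge are counted.
    """
    blocks_of = defaultdict(set)
    for (u, v), b in zip(edges, edge_block):
        blocks_of[u].add(b)
        blocks_of[v].add(b)
    return sum(len(s) for s in blocks_of.values()) / len(blocks_of)


# Toy 4-cycle (illustrative data only).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
rf_one_block = replication_factor(edges, [0, 0, 0, 0])   # no vertex is replicated
rf_two_blocks = replication_factor(edges, [0, 0, 1, 1])  # vertices 0 and 2 span two blocks
```

A value of 1.0 means no vertex is replicated; larger values mean more copies of vertices must be kept consistent across blocks.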

Paper Structure (21 sections, 2 theorems, 2 equations, 6 figures, 2 tables, 1 algorithm)

Introduction
Preliminaries
Basic Concepts
Related Work
Buffered Streaming Edge Partitioning
Overall Algorithm
Input and Batch Format
Model Construction
Partitioning
FreightE
Experimental Evaluation
Parameter Tuning
k-Independent Initial Partitioning.
Graph Model Mode.
Fennel Alpha.
...and 6 more sections

Key Result

Theorem 1

For any vertex partition $vp(S^*_b)$ of the CSPAC graph $S^*_b$ with edge-cut $cost(vp(S^*_b))$, there exists a corresponding edge partition $ep(G_b)$ of the batch graph $G_b$ with a number of vertex replicas $cost(ep(G_b))$, satisfying $cost(ep(G_b)) \leq cost(vp(S^*_b))$, which establishes a lower
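The inequality in Theorem 1 can be checked by brute force on a tiny example. The sketch below is not the authors' implementation: it builds an unweighted CSPAC graph by chaining, for each vertex $v$, the edges incident to $v$ into a path, and it assumes that the number of vertex replicas is $\sum_v (|B(v)| - 1)$, where $B(v)$ is the set of blocks containing an edge incident to $v$. All function names and the toy graph are hypothetical.

```python
from collections import defaultdict
from itertools import product


def build_cspac(edges):
    """Return CSPAC edges; CSPAC vertex i stands for edges[i] of G."""
    incident = defaultdict(list)
    for i, (u, v) in enumerate(edges):
        incident[u].append(i)
        incident[v].append(i)
    cspac_edges = []
    for ids in incident.values():
        # The d(v) edges incident to v are linked by a path of d(v) - 1
        # auxiliary edges.
        cspac_edges.extend(zip(ids, ids[1:]))
    return cspac_edges


def edge_cut(cspac_edges, block):
    """Number of CSPAC edges whose endpoints lie in different blocks."""
    return sum(1 for a, b in cspac_edges if block[a] != block[b])


def replicas(edges, block):
    """Assumed replica count: sum over vertices of (blocks touched - 1)."""
    blocks_of = defaultdict(set)
    for i, (u, v) in enumerate(edges):
        blocks_of[u].add(block[i])
        blocks_of[v].add(block[i])
    return sum(len(s) - 1 for s in blocks_of.values())


# Toy graph with 5 edges; try every assignment of edges to k = 3 blocks.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
cspac_edges = build_cspac(edges)
holds_for_all = all(
    replicas(edges, block) <= edge_cut(cspac_edges, block)
    for block in product(range(3), repeat=len(edges))
)
```

The intuition the check exercises: if the edges around $v$ fall into $p$ distinct blocks, the path induced by $v$ must cross a block boundary at least $p - 1$ times, so each vertex contributes at least as much to the edge-cut as to the replica count.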

Figures (6)

Figure 1: Multilevel scheme.
Figure 2: Detailed structure of HeiStreamE. The algorithm starts by loading a batch graph $G_b$ consisting of vertices and their edges to the current batch and previous batches. Subsequently, it builds a meaningful model $\beta_b$ from the batch graph, transforming edges into vertices, and incorporating a synthetic representation of the assignments made in previous batches. This model is then partitioned using a multilevel algorithm. Lastly, the edges from the loaded batch, which correspond to vertices in the partitioned batch model, are permanently assigned to blocks. This process is repeated for subsequent batches until the entire graph has been partitioned.
Figure 3: Building SPAC and CSPAC Graphs: The SPAC graph, denoted as $G'$, features $d(v)$ split vertices for every vertex $v$ in the original graph $G$. These split vertices are represented in the same color as the vertex they originate from. Every (thick green) edge from $G$ is directly converted into a distinct dominant (thick green) edge in $G'$ that connects corresponding split vertices. The auxiliary (thin) edges in $G'$, which create a path between split vertices, are depicted in the same color as the split vertices they link. The CSPAC graph $G^*$ is formed by contracting the dominant edges of $G'$. The vertices in $G^*$ represent the edges in $G$, while the edges in $G^*$ mirror the auxiliary edges in $G'$.
Figure 4: Graph model $\beta_b$ construction. $\beta_b$ is obtained by appending past assignment decisions to $S^*_b$. If a vertex of the current batch graph $u \in G_b$ has an edge $e = (u, v)$ to a previous batch (colored blue), we connect the CSPAC vertex $u^*$ induced by $e$ to artificial vertices representing blocks assigned to edges incident on $v$ as follows: (a) Maximal Mode: $u^*$ connects to all blocks incident on $v$; (b) $r$-Subset Mode: $u^*$ connects to $r$ random blocks incident on $v$; (c) Minimal Mode: $u^*$ connects to the block assigned to the most recently partitioned edge incident on $v$.
Figure 5: Comparison of HeiStreamE and FreightE with 2PS-HDRF, 2PS-L and HDRF on the Test Set listed in the appendix table, using performance profiles. Let $\mathcal{A}$ be the set of all algorithms, $\mathcal{I}$ the set of instances, and $q_A(I)$ the quality of algorithm $A \in \mathcal{A}$ on instance $I \in \mathcal{I}$. For each algorithm $A$, we plot the fraction of instances $\frac{|\mathcal{I}_A(\tau)|}{|\mathcal{I}|}$ (y-axis) where $\mathcal{I}_A(\tau) := \left\{ I \in \mathcal{I} \mid q_A(I) \leq \tau \cdot \min_{A' \in \mathcal{A}} q_{A'}(I) \right\}$ and $\tau$ is on the x-axis. Includes all $k$ values. Note the logarithmic scale in the final third of the plots. Memory consumption is measured as the maximum resident set size of the program execution.
...and 1 more figure
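The performance-profile definition in the Figure 5 caption can be computed directly. The sketch below follows that definition, assuming lower values of $q_A(I)$ are better and all values are positive; the function name, algorithm labels, and numbers are made-up toy data, not results from the paper.

```python
def performance_profile(quality, taus):
    """Fraction of instances on which each algorithm is within a factor tau of the best.

    quality -- {algorithm: {instance: q_A(I)}}; lower is better, values > 0 (assumed)
    taus    -- iterable of tau values (the x-axis)
    """
    instances = list(next(iter(quality.values())).keys())
    # Best (smallest) value achieved by any algorithm on each instance.
    best = {I: min(q[I] for q in quality.values()) for I in instances}
    return {
        A: [
            sum(1 for I in instances if q[I] <= tau * best[I]) / len(instances)
            for tau in taus
        ]
        for A, q in quality.items()
    }


# Toy values for two hypothetical algorithms on three hypothetical instances.
quality = {
    "A": {"g1": 1.0, "g2": 2.0, "g3": 3.0},
    "B": {"g1": 1.5, "g2": 1.0, "g3": 3.0},
}
profile = performance_profile(quality, [1.0, 1.5, 2.0])
```

At $\tau = 1$ the profile gives the fraction of instances on which an algorithm is (tied for) best; a curve that reaches 1 at a small $\tau$ indicates an algorithm that is never far from the best result.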

Theorems & Definitions (2)

Theorem 1
Theorem 2
