Jet: Multilevel Graph Partitioning on Graphics Processing Units

Michael S. Gilbert, Kamesh Madduri, Erik G. Boman, Sivasankaran Rajamanickam

TL;DR

Jet is a GPU-accelerated refinement algorithm for multilevel graph partitioning. It combines a two-phase refinement based on label propagation (LP), called Jetlp, with a rebalancing phase, Jetr, to produce high-quality $k$-way partitions while maintaining balance. The approach pairs this refinement with a GPU-friendly coarsening strategy and uses Kokkos for performance-portable kernels, which gives fast runtimes across diverse graph classes. Empirical results show that Jet matches or exceeds the cut quality of CPU-based refiners in most cases and delivers substantial speedups, especially on irregular graphs; it is weaker on 2D-structured meshes and web-graph-like datasets. These findings indicate that Jet is a practical, high-performance refinement engine for GPU-based multilevel partitioning, with potential for further gains in distributed-memory settings.
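
To make the quantities in this summary concrete, the following is a minimal sequential Python sketch of edge cut, imbalance, and one greedy label-propagation sweep under a size cap. It is not Jet's algorithm: Jetlp and Jetr are GPU-parallel, and their details are not given in this summary. The function names (`build_adj`, `edge_cut`, `imbalance`, `lp_pass`), the `max_size` cap, and the example graph are all assumptions made for illustration.

```python
from collections import defaultdict


def build_adj(edges):
    """Build an undirected weighted adjacency map from (u, v, w) triples."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w
    return dict(adj)


def edge_cut(adj, part):
    """Total weight of edges whose endpoints lie in different parts."""
    cut = 0
    for u, nbrs in adj.items():
        for v, w in nbrs.items():
            if u < v and part[u] != part[v]:  # count each edge once
                cut += w
    return cut


def imbalance(part, k):
    """Largest part size divided by the ideal size n/k (uniform vertex weights)."""
    sizes = [0] * k
    for p in part.values():
        sizes[p] += 1
    return max(sizes) / (len(part) / k)


def lp_pass(adj, part, k, max_size):
    """One sequential sweep (illustrative only): move each vertex to the
    neighbouring part it is most strongly connected to, if that strictly
    lowers the cut and the destination part has room under max_size."""
    sizes = [0] * k
    for p in part.values():
        sizes[p] += 1
    for u in sorted(adj):
        conn = defaultdict(int)  # connectivity of u to each adjacent part
        for v, w in adj[u].items():
            conn[part[v]] += w
        src = part[u]
        own = conn.get(src, 0)
        best, best_gain = src, 0
        for p in sorted(conn):
            gain = conn[p] - own  # cut reduction if u moves to p
            if p != src and gain > best_gain and sizes[p] < max_size:
                best, best_gain = p, gain
        if best != src:
            sizes[src] -= 1
            sizes[best] += 1
            part[u] = best
    return part


# Hypothetical example: two triangles joined by the edge (2, 3).
edges = [(0, 1, 1), (0, 2, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1), (3, 5, 1), (4, 5, 1)]
adj = build_adj(edges)
start = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 1}  # vertex 2 is misplaced
refined = lp_pass(adj, dict(start), k=2, max_size=3)
print(edge_cut(adj, start), edge_cut(adj, refined))  # cut drops from 2 to 1
```

In this toy run the sweep moves vertex 2 back to its triangle, which both lowers the cut and restores perfect balance. A sequential sweep like this sees each move immediately; a parallel refiner cannot rely on that, which is one reason a separate rebalancing phase is useful.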

Abstract

The multilevel heuristic is the dominant strategy for high-quality sequential and parallel graph partitioning. Partition refinement is a key step of multilevel graph partitioning. In this work, we present Jet, a new parallel algorithm for partition refinement specifically designed for Graphics Processing Units (GPUs). We combine Jet with GPU-aware coarsening to develop a $k$-way graph partitioner, the Jet partitioner. The new partitioner achieves superior quality compared to state-of-the-art shared memory partitioners on a large collection of test graphs.


Paper Structure

This paper contains 42 sections, 1 theorem, 11 equations, 4 figures, 6 tables, 6 algorithms.

Key Result

Theorem 1

Let $L'_x$ be the prefix of $L'$ that minimizes the equation labeled eq:prefix in the paper. In a graph with uniform vertex weights, and assuming the number of vertices with negative loss is negligible, we have the following inequality:

Figures (4)

  • Figure 1: We use performance profiles to compare cutsize obtained using our partitioner to others.
  • Figure 2: Cutsize Results By Class
  • Figure 3: We use performance profiles to compare partitioning time of the Jet partitioner (Ours) on the A100 GPU to the execution time of other partitioners. The other partitioners are executed on the AMD Ryzen Threadripper 3970X CPU.
  • Figure 4: Partitioning Time Comparison
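
Figures 1 and 3 rely on performance profiles. As a rough illustration of how such a profile is commonly computed (the standard ratio-to-best construction, which may differ in detail from the paper's plots), the sketch below reports, for each solver, the fraction of instances on which its metric is within a factor $\tau$ of the best value any solver achieved. The solver names and all numbers are made up for the example and are not results from the paper; the function name `performance_profile` is likewise an assumption.

```python
def performance_profile(results, taus):
    """results: {solver: [metric per instance]}, lower is better, all > 0.
    Returns {solver: [fraction of instances with metric <= tau * best]}."""
    solvers = list(results)
    n = len(results[solvers[0]])
    # Best (smallest) metric achieved by any solver on each instance.
    best = [min(results[s][i] for s in solvers) for i in range(n)]
    profile = {}
    for s in solvers:
        ratios = [results[s][i] / best[i] for i in range(n)]
        profile[s] = [sum(r <= t for r in ratios) / n for t in taus]
    return profile


# Synthetic cut sizes for two hypothetical solvers on four instances.
results = {
    "solver_a": [100, 200, 300, 400],
    "solver_b": [110, 180, 300, 500],
}
profile = performance_profile(results, taus=[1.0, 1.05, 1.15, 1.3])
for name, fractions in profile.items():
    print(name, fractions)
```

A curve that sits higher and further left indicates a solver that is close to the best result on more instances; the value at $\tau = 1$ is the fraction of instances on which the solver is itself the best (ties included).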

Theorems & Definitions (1)

  • Theorem 1