A Parallel and Distributed Rust Library for Core Decomposition on Large Graphs

Davide Rucci, Sebastian Parfeniuc, Matteo Mordacchini, Emanuele Carlini, Alfredo Cuzzocrea, Patrizio Dazzi

TL;DR

This work addresses scalable k-core decomposition on very large graphs by adapting the decentralized protocol of Montresor et al. to a shared-memory Rust library. It introduces three implementations (SequentialK, ParallelK, and FastK), with FastK employing cache-friendly data structures, selective messaging, and activation scheduling to maximize throughput. Empirical results on real-world datasets show FastK achieving up to an 11x speedup on 16 threads and running up to two orders of magnitude faster than NetworkX, which supports Rust's suitability for high-performance graph analytics. Overall, the paper demonstrates that a carefully engineered, memory-safe Rust solution can deliver practical, scalable core decomposition and provides reusable building blocks for parallel and distributed graph algorithms.

Abstract

In this paper, we investigate the parallelization of $k$-core decomposition, a method used in graph analysis to identify cohesive substructures and assess node centrality. Although efficient sequential algorithms exist for this task, the scale of modern networks requires faster, multicore-ready approaches. To this end, we adapt a distributed $k$-core algorithm originally proposed by Montresor et al. to shared-memory systems and implement it in Rust, leveraging the language's strengths in concurrency and memory safety. We developed three progressively optimized versions: SequentialK as a baseline, ParallelK introducing multi-threaded message passing, and FastK further reducing synchronization overhead. Extensive experiments on real-world datasets, including road networks, web graphs, and social networks, show that FastK consistently outperforms both SequentialK and ParallelK, as well as a reference Python implementation available in the NetworkX library. Results indicate up to an 11x speedup on 16 threads and execution times up to two orders of magnitude faster than the Python implementation.
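To make the idea behind the adapted protocol concrete, the following is a minimal single-threaded sketch of estimate-based core decomposition. It is not the library's API: the function names `compute_index` and `coreness` and the adjacency-list input are assumptions made for illustration. Each node starts with its degree as an upper bound on its coreness and repeatedly lowers that bound to the largest $k$ such that at least $k$ neighbours currently hold an estimate of at least $k$. When no estimate changes, the estimates equal the coreness values.

```rust
/// Largest k <= cap such that at least k of the given estimates are >= k.
/// (Hypothetical helper name; it mirrors the local update rule of the protocol.)
fn compute_index(neighbor_estimates: &[usize], cap: usize) -> usize {
    // count[i] = number of neighbours whose estimate, clamped to cap, equals i.
    let mut count = vec![0usize; cap + 1];
    for &e in neighbor_estimates {
        count[e.min(cap)] += 1;
    }
    // Walk down from cap, accumulating how many neighbours have estimate >= k.
    let mut at_least = 0;
    for k in (1..=cap).rev() {
        at_least += count[k];
        if at_least >= k {
            return k;
        }
    }
    0
}

/// Coreness of every node of an undirected graph given as adjacency lists.
/// Sequential sketch only: no message passing, batching, or activation scheduling.
fn coreness(adj: &[Vec<usize>]) -> Vec<usize> {
    // Initial estimate: the degree, which is always an upper bound on coreness.
    let mut est: Vec<usize> = adj.iter().map(|n| n.len()).collect();
    loop {
        let mut changed = false;
        for v in 0..adj.len() {
            // Read the neighbours' current estimates (stands in for received messages).
            let neigh: Vec<usize> = adj[v].iter().map(|&u| est[u]).collect();
            let new = compute_index(&neigh, est[v]);
            if new < est[v] {
                est[v] = new;
                changed = true;
            }
        }
        if !changed {
            return est;
        }
    }
}

fn main() {
    // Triangle 0-1-2 with a pendant node 3 attached to node 2.
    let adj = vec![vec![1, 2], vec![0, 2], vec![0, 1, 3], vec![2]];
    println!("{:?}", coreness(&adj));
}
```

In this sketch a node re-reads all of its neighbours in every round. A message-passing variant would instead only notify neighbours when an estimate actually drops, which is the kind of overhead the parallel versions described above aim to reduce.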

Paper Structure

This paper contains 23 sections, 7 figures, 2 tables, 4 algorithms.

Figures (7)

  • Figure 1: Runtime with respect to batch size for the ParallelK implementation.
  • Figure 2: Runtime comparison between implementation with hashmaps and with sorted vectors.
  • Figure 3: Runtime comparison among different parallelization strategies in ParallelK.
  • Figure 4: Average distance from nodes' true coreness value and the current estimate during the execution of FastK.
  • Figure 5: Percentage of activated nodes during the execution of FastK.
  • ...and 2 more figures

Theorems & Definitions (1)

  • Definition 1: $k$-core (Seidman, 1983)