Enhancing Scalability and Performance in Influence Maximization with Optimized Parallel Processing

Hanjiang Wu, Huan Xu, Joongun Park, Jesmin Jahan Tithi, Fabio Checconi, Jordi Wolfson-Pou, Fabrizio Petrini, Tushar Krishna

TL;DR

This paper proposes EFFICIENTIMM, a state-of-the-art Influence Maximization via Martingales (IMM) implementation that improves on the work efficiency of the current IMM implementation through key strategies including a new parallelization scheme, NUMA-aware memory usage, dynamic load balancing, and fine-grained adaptive data structures.

Abstract

Influence Maximization (IM) is vital in viral marketing and biological network analysis for identifying key influencers. Because the problem is NP-hard, approximate solutions are employed. This paper addresses scalability challenges in scale-out shared-memory systems by focusing on the state-of-the-art Influence Maximization via Martingales (IMM) benchmark. To enhance the work efficiency of the current IMM implementation, we propose EFFICIENTIMM with key strategies, including a new parallelization scheme, NUMA-aware memory usage, dynamic load balancing, and fine-grained adaptive data structures. Benchmarked on a 128-core CPU system with 8 NUMA nodes, EFFICIENTIMM demonstrates significant performance improvements, achieving an average 5.9x speedup across 8 diverse SNAP datasets relative to the best execution times of the original Ripples framework. Additionally, on the Youtube graph, EFFICIENTIMM demonstrates a better memory access pattern, with a 357.4x reduction in L1+L2 cache misses compared to Ripples.

Paper Structure

This paper contains 19 sections, 3 equations, 7 figures, 4 tables, 3 algorithms.

Figures (7)

  • Figure 1: Ripples Strong Scaling Performance
  • Figure 2: Ripples Runtime Breakdown
  • Figure 3: Counter updates pattern in our proposed algorithm.
  • Figure 4: System architecture and NUMA-aware data structures
  • Figure 5: Runtime Comparison w/ and w/o Adaptive Update when using 128 cores
  • ...and 2 more figures