Enhancing Scalability and Performance in Influence Maximization with Optimized Parallel Processing
Hanjiang Wu, Huan Xu, Joongun Park, Jesmin Jahan Tithi, Fabio Checconi, Jordi Wolfson-Pou, Fabrizio Petrini, Tushar Krishna
TL;DR
EFFICIENTIMM is proposed, a state-of-the-art Influence Maximization via Martingales (IMM) implementation with key strategies, including new parallelization scheme, NUMA-aware memory usage, dynamic load balancing and fine-grained adaptive data structures, to enhance the work efficiency of the current IMM implementation.
Abstract
Influence Maximization (IM) is vital in viral marketing and biological network analysis for identifying key influencers. Given its NP-hard nature, approximate solutions are employed. This paper addresses scalability challenges in scale-out shared memory system by focusing on the state-of-the-art Influence Maximization via Martingales (IMM) benchmark. To enhance the work efficiency of the current IMM implementation, we propose EFFICIENTIMM with key strategies, including new parallelization scheme, NUMA-aware memory usage, dynamic load balancing and fine-grained adaptive data structures. Benchmarking on a 128-core CPU system with 8 NUMA nodes, EFFICIENTIMM demonstrated significant performance improvements, achieving an average 5.9x speedup over Ripples across 8 diverse SNAP datasets, when compared to the best execution times of the original Ripples framework. Additionally, on the Youtube graph, EFFICIENTIMM demonstrates a better memory access pattern with 357.4x reduction in L1+L2 cache misses as compared to Ripples.
