FS_GPlib: Breaking the Web-Scale Barrier - A Unified Acceleration Framework for Graph Propagation Models

Chang Guo, Juyuan Zhang, Chang Su, Tianlong Fan, Linyuan Lü

Abstract

Propagation models are essential for modeling and simulating dynamic processes such as epidemics and information diffusion. However, existing tools struggle to scale to the large graphs that arise in social networks, epidemic networks, and similar settings, because of limited algorithmic efficiency, weak scalability, and high communication overhead. We present FS_GPlib, a unified library that enables efficient, high-fidelity propagation modeling on Web-scale graphs. FS_GPlib introduces a dual-acceleration framework: it combines micro-level synchronous message-passing updates with macro-level batched Monte Carlo simulation, leveraging high-dimensional tensor operations for parallel execution. To further enhance scalability, it supports distributed simulation via a novel target-node-based graph partitioning strategy that minimizes communication overhead while maintaining load balance. Theoretically, we show that, under ideal assumptions, the simulation runtime converges approximately to a constant. Extensive experiments demonstrate speedups of up to 35,000× over standard libraries such as NDlib, and a full Monte Carlo simulation on a Web-scale (billion-edge) graph completes in 11 seconds while maintaining high simulation fidelity. FS_GPlib supports 29 propagation models, including epidemic, opinion dynamics, and dynamic network models, and offers a lightweight Python API compatible with mainstream data science ecosystems. By addressing the unique challenges of modeling diffusion and cascades on the Web, FS_GPlib provides a scalable, extensible, and theoretically grounded solution for large-scale propagation analysis in epidemiology, social media analysis, and online network dynamics. Code available at: https://github.com/Allen-Ciel/FS_GPlib.
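
To make the dual-acceleration idea concrete, the following is a minimal NumPy sketch of an Independent Cascade (IC) simulation in which all nodes are updated synchronously in each step (micro level) and all Monte Carlo runs are carried as one extra array dimension (macro level). It is an illustration under stated assumptions, not FS_GPlib's API: the function name `batched_ic`, its parameters, the edge-list layout, and the uniform activation probability `p` are all hypothetical.

```python
import numpy as np

def batched_ic(src, dst, num_nodes, seeds, p, num_runs, rng=None):
    """Hypothetical sketch: run `num_runs` IC simulations at once.

    src, dst : edge list (edge i goes from src[i] to dst[i])
    seeds    : initially active node ids (shared by all runs)
    p        : uniform activation probability per edge (an assumption)
    Returns the final number of activated nodes in each run.
    """
    rng = np.random.default_rng() if rng is None else rng
    src = np.asarray(src, dtype=np.int64)
    dst = np.asarray(dst, dtype=np.int64)

    # One row per Monte Carlo run, one column per node.
    activated = np.zeros((num_runs, num_nodes), dtype=bool)
    activated[:, seeds] = True
    frontier = activated.copy()  # nodes activated in the previous step

    while frontier.any():
        # An edge fires in a run if its source is in that run's frontier
        # and its independent Bernoulli(p) trial succeeds.
        fires = frontier[:, src] & (rng.random((num_runs, src.size)) < p)

        # Aggregate fired edges onto their target nodes, for every run at once.
        run_idx, edge_idx = np.nonzero(fires)
        reached = np.zeros_like(activated)
        reached[run_idx, dst[edge_idx]] = True

        # Only previously inactive nodes join the new frontier.
        frontier = reached & ~activated
        activated |= frontier

    return activated.sum(axis=1)
```

Because each node enters the frontier exactly once, each edge gets a single activation attempt per run, which matches the usual IC definition. The per-step work is a handful of array operations over a `(num_runs, num_edges)` array, so the Python-level loop runs once per propagation step rather than once per node or per run.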

Paper Structure

This paper contains 29 sections, 4 theorems, 20 equations, 8 figures, 11 tables, 1 algorithm.

Key Result

Proposition 1 (Load Balance Guarantee)

Let $w_{\max}=\max_j w_j$. For each partition $q$, the edge load deviates from the ideal average $W/d$ by at most $w_{\max}$; that is, the workload deviation is bounded by the maximum in-degree.
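
The bound can be checked numerically with a small sketch of the longest-processing-time (LPT) heuristic named in Definition 7. The sketch assumes that $w_j$ is the in-degree of target node $j$, that $W$ is the total edge count, and that $d$ is the number of partitions; the function name `lpt_partition` and the example in-degrees are hypothetical and are not taken from the paper or the library.

```python
import heapq

def lpt_partition(in_degrees, num_parts):
    """Hypothetical sketch of target-node partitioning with the LPT heuristic.

    Each target node j carries all of its in_degrees[j] incoming edges.
    Nodes are visited in order of decreasing in-degree, and each one is
    assigned to the partition with the smallest current edge load.
    Returns (assignment, loads).
    """
    # Min-heap of (current edge load, partition id).
    heap = [(0, q) for q in range(num_parts)]
    heapq.heapify(heap)

    assignment = [0] * len(in_degrees)
    order = sorted(range(len(in_degrees)),
                   key=lambda j: in_degrees[j], reverse=True)
    for j in order:
        load, q = heapq.heappop(heap)          # least-loaded partition
        assignment[j] = q
        heapq.heappush(heap, (load + in_degrees[j], q))

    loads = [0] * num_parts
    for load, q in heap:
        loads[q] = load
    return assignment, loads


# Example in-degrees (made up for illustration).
in_degrees = [7, 5, 4, 3, 3, 2, 1, 1]
d = 3
assignment, loads = lpt_partition(in_degrees, d)

ideal = sum(in_degrees) / d                    # W / d
max_deviation = max(abs(load - ideal) for load in loads)
print(loads, max_deviation <= max(in_degrees))
```

Since every incoming edge of a node is stored in that node's partition, aggregation over in-neighbors can be done locally, which is consistent with the communication-efficiency remark in the listing of theorems and definitions.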

Figures (8)

  • Figure 1: Architecture of the dual-acceleration propagation framework based on micro-level message passing and macro-level batched Monte Carlo computations.
  • Figure 2: Distributed propagation simulation strategy with target-node-based partitioning.
  • Figure 3: Monte Carlo simulation results for three IC implementations across four datasets, highlighting FS_GPlib’s consistency and accuracy.
  • Figure 4: Monte Carlo simulation results for three SIR implementations across datasets, illustrating FS_GPlib’s consistency and accuracy.
  • Figure 5: Speedup of IC implementations over NDlib.
  • ...and 3 more figures

Theorems & Definitions (14)

  • Definition 1: Graph
  • Definition 2: Compressed Sparse Row (CSR)
  • Definition 3: Mechanism-Based Propagation Model
  • Definition 4: Paradigm of Mechanism-Based Propagation Model
  • Definition 5: Node-Synchronized Update Method
  • Definition 6: Batched Monte Carlo Method
  • Definition 7: Target-Node-Based Partitioning via Longest Processing Time Heuristic
  • Proposition 1: Load Balance Guarantee
  • Remark 1: Communication Efficiency
  • Theorem 1: Constant-Time Micro-Level Node Synchronization
  • ...and 4 more