Efficient Influence Minimization via Node Blocking

Jinghao Wang, Yanping Wu, Xiaoyang Wang, Ying Zhang, Lu Qin, Wenjie Zhang, Xuemin Lin

TL;DR

The paper tackles Influence Minimization via node blocking (IMIN) under the Independent Cascade model, a non-submodular, NP-hard problem for which prior work offered no non-trivial guarantees and faced high computational cost. It introduces SandIMIN, a Sandwich-framework method that builds monotone submodular lower and upper bounds for the non-submodular objective and uses novel sampling and martingale concentration techniques to achieve $(1-1/e-\epsilon)$-approximation guarantees with high probability for the bounding functions. Two novel estimators are developed: Local Sampling with CP sequences to estimate the lower bound and Global Sampling with Local Reverse Reachable sets to estimate the upper bound, enabling scalable estimation on large networks. Two non-trivial algorithms, LSBM and GSBM, maximize the bounding functions with data-dependent guarantees, complemented by a lightweight heuristic LHGA to fill the Sandwich, with SandIMIN selecting the best among these components. Extensive experiments on nine real-world graphs show SandIMIN achieves up to two orders of magnitude speedups over state-of-the-art methods while maintaining a competitive decrease in misinformation spread, establishing practical applicability for large-scale IMIN tasks.
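
The final selection step of a Sandwich-style method can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `sandwich_select`, the candidate labels, and the `estimate_decrease` callback are all hypothetical, and how each candidate set is produced (by LSBM, GSBM, or LHGA) is outside the sketch.

```python
def sandwich_select(candidates, estimate_decrease):
    """Pick the candidate blocker set with the largest estimated decrease.

    candidates: dict mapping a label (hypothetical, e.g. "lower", "upper",
        "heuristic") to a set of blocker nodes.
    estimate_decrease: callable returning an estimate of how much a blocker
        set reduces the expected spread under the original objective.
    """
    best_label = max(candidates, key=lambda label: estimate_decrease(candidates[label]))
    return best_label, candidates[best_label]
```

The point of the design is that every candidate is scored against the original, non-submodular objective, so the returned set is at least as good (under the estimator) as the solution to either bounding function alone.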

Abstract

Given a graph G, a budget k and a misinformation seed set S, Influence Minimization (IMIN) via node blocking aims to find a set of k nodes to be blocked such that the expected spread of S is minimized. This problem finds important applications in suppressing the spread of misinformation and has been extensively studied in the literature. However, existing solutions for IMIN still incur significant computation overhead, especially when k becomes large. In addition, there is still no approximation solution with a non-trivial theoretical guarantee for IMIN via node blocking prior to our work. In this paper, we make the first attempt to propose algorithms that yield data-dependent approximation guarantees. Based on the Sandwich framework, we first develop submodular and monotonic lower and upper bounds for our non-submodular objective function and prove that computing the proposed bounds is #P-hard. In addition, two advanced sampling methods are proposed to estimate the value of the bounding functions. Moreover, we develop two novel martingale-based concentration bounds to reduce the sample complexity and design two non-trivial algorithms that provide (1-1/e-ε)-approximate solutions to our bounding functions. Comprehensive experiments on 9 real-world datasets are conducted to validate the efficiency and effectiveness of the proposed techniques. Compared with the state-of-the-art methods, our solutions can achieve up to two orders of magnitude speedup and provide theoretical guarantees for the quality of returned results.
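
To make the objective concrete, the sketch below estimates the expected spread of a seed set under the Independent Cascade model by plain Monte Carlo simulation, with blocked nodes never activated. It is a naive baseline for illustration only, not the sampling method proposed in the paper. The adjacency format, the function names, and the assumption that blockers are non-seed nodes are choices made for this example.

```python
import random


def simulate_spread(graph, seeds, blocked, rng):
    """Run one Independent Cascade simulation and return the number of active nodes.

    graph: dict mapping node -> list of (neighbor, propagation probability).
    blocked: nodes that can never be activated (assumed disjoint from seeds).
    """
    blocked = set(blocked)
    active = set(seeds)
    frontier = list(active)
    while frontier:
        u = frontier.pop()
        # Each newly active node gets exactly one chance per outgoing edge.
        for v, p in graph.get(u, []):
            if v in active or v in blocked:
                continue
            if rng.random() < p:
                active.add(v)
                frontier.append(v)
    return len(active)


def expected_spread(graph, seeds, blocked=(), rounds=2000, seed=0):
    """Average the cascade size over `rounds` independent simulations."""
    rng = random.Random(seed)
    total = sum(simulate_spread(graph, seeds, blocked, rng) for _ in range(rounds))
    return total / rounds
```

On a toy graph such as `{"s": [("a", 1.0)], "a": [("b", 1.0), ("c", 1.0)]}`, blocking `a` cuts every path out of the seed `s`, so the estimated spread drops to the seed alone. The cost of this approach grows with the number of simulations and candidate blockers, which is the overhead the abstract refers to.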

Paper Structure

This paper contains 22 sections, 11 theorems, 18 equations, 22 figures, 5 tables, 4 algorithms.

Key Result

lemma 1

Given a seed set $S$ and its unified seed node $s$, $D_s^L(\cdot)$ is monotone nondecreasing and submodular under the IC model.
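
For reference, the two properties named in the lemma have the following standard definitions for a set function. The symbols $f$, $V$, $A$, $B$, and $x$ are generic placeholders introduced here, not notation taken from the paper; in the lemma, $f$ would be $D_s^L(\cdot)$ and the sets would be candidate blocker sets.

```latex
% Monotone nondecreasing: adding elements never lowers the value.
f(A) \le f(B) \quad \text{for all } A \subseteq B \subseteq V

% Submodular: the marginal gain of an element shrinks as the set grows.
f(A \cup \{x\}) - f(A) \;\ge\; f(B \cup \{x\}) - f(B)
\quad \text{for all } A \subseteq B \subseteq V,\; x \in V \setminus B
```

These properties matter because greedy selection under a cardinality constraint is known to give a $(1-1/e)$-approximation for maximizing a monotone submodular function, which is the source of the $(1-1/e-\epsilon)$ factor quoted for the bounding functions once sampling error is accounted for.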

Figures (22)

  • Figure 1: Social network $G$
  • Figure 2: A realization $\phi$
  • Figure 4: SandIMIN
  • Figure 5: Example of the bounds
  • Figure 6: Local Sampling
  • ...and 17 more figures

Theorems & Definitions (16)

  • definition 1: Dominator
  • definition 2: Immediate Dominator
  • definition 3: Dominator Tree (DT)
  • lemma 1
  • definition 4: Common Path (CP) Set & Common Path (CP) Sequence
  • lemma 2
  • definition 5: Local Reverse Reachable (LRR) Set
  • lemma 3
  • lemma 4: Concentration Bounds
  • lemma 5
  • ...and 6 more