Graph Bayesian Optimization for Multiplex Influence Maximization

Zirui Yuan, Minglai Shao, Zhiqian Chen

TL;DR

This work addresses multiplex influence maximization (Multi-IM), where multiple information items propagate and interact on a multiplex network. It introduces GBIM, a Graph Bayesian Optimization framework that learns a scalable surrogate via a global kernelized attention message-passing module, uses Bayesian linear regression to quantify uncertainty, and performs acquisition-driven seed selection. Extensive experiments on real-world networks and synthetic data under Multi-LT and Multi-IC diffusion show substantial gains over traditional IM methods (e.g., IMM, CELF++) and other baselines, including gains of more than $40\%$ on LastFM. The approach provides a practical, scalable solution for multi-item campaigns and lays groundwork for richer modeling of item-item relations in diffusion processes.

Abstract

Influence maximization (IM) is the problem of identifying a limited number of initial influential users within a social network to maximize the number of influenced users. However, previous research has mostly focused on individual information propagation, neglecting the simultaneous and interactive dissemination of multiple information items. In reality, when users encounter a piece of information, such as a smartphone product, they often associate it with related products in their minds, such as earphones or computers from the same brand. Additionally, information platforms frequently recommend related content to users, amplifying this cascading effect and leading to multiplex influence diffusion. This paper first formulates the Multiplex Influence Maximization (Multi-IM) problem using multiplex diffusion models with an information association mechanism. In this problem, the seed set is a combination of influential users and information. To effectively manage the combinatorial complexity, we propose Graph Bayesian Optimization for Multi-IM (GBIM). The multiplex diffusion process is thoroughly investigated using a highly effective global kernelized attention message-passing module. This module, in conjunction with Bayesian linear regression (BLR), produces a scalable surrogate model. A data acquisition module incorporating the exploration-exploitation trade-off is developed to optimize the seed set further. Extensive experiments on synthetic and real-world datasets have proven our proposed framework effective. The code is available at https://github.com/zirui-yuan/GBIM.

Paper Structure

This paper contains 23 sections, 1 theorem, 21 equations, 5 figures, 1 table.

Key Result

Theorem 1

The surrogate model constructed by combining neural network basis functions $\phi(\boldsymbol{x})$ with Bayesian linear regression is a special case of Gaussian process regression with a linear kernel.
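
The equivalence stated in Theorem 1 can be checked numerically. The sketch below is illustrative only: the fixed random `tanh` features stand in for a trained neural network basis $\phi(\boldsymbol{x})$, and the prior precision `alpha`, the noise variance `noise_var`, and all function names are assumptions rather than anything taken from the paper or its code. It computes the predictive mean and latent variance once in weight space (Bayesian linear regression) and once in function space (Gaussian process regression with the linear kernel $k(\boldsymbol{x}, \boldsymbol{x}') = \phi(\boldsymbol{x})^\top \phi(\boldsymbol{x}') / \alpha$).

```python
import numpy as np

def blr_predict(Phi, y, Phi_star, alpha=1.0, noise_var=0.1):
    """Weight-space view: prior w ~ N(0, I / alpha), Gaussian noise."""
    d = Phi.shape[1]
    # Posterior covariance of the weights.
    A_inv = np.linalg.inv(alpha * np.eye(d) + Phi.T @ Phi / noise_var)
    w_mean = A_inv @ Phi.T @ y / noise_var
    mean = Phi_star @ w_mean
    # Variance of the latent function: phi_*^T A^{-1} phi_* for each test row.
    var = np.einsum("ij,jk,ik->i", Phi_star, A_inv, Phi_star)
    return mean, var

def gp_linear_predict(Phi, y, Phi_star, alpha=1.0, noise_var=0.1):
    """Function-space view: GP with kernel k(x, x') = phi(x)^T phi(x') / alpha."""
    K = Phi @ Phi.T / alpha
    K_s = Phi_star @ Phi.T / alpha
    k_ss = np.sum(Phi_star ** 2, axis=1) / alpha
    C = K + noise_var * np.eye(len(y))
    mean = K_s @ np.linalg.solve(C, y)
    var = k_ss - np.sum(K_s * np.linalg.solve(C, K_s.T).T, axis=1)
    return mean, var

# Toy data; the random projection W is a hypothetical stand-in for a network.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))

def features(X):
    return np.tanh(X @ W)

X_train = rng.normal(size=(20, 3))
X_test = rng.normal(size=(5, 3))
y_train = np.sin(X_train.sum(axis=1)) + 0.1 * rng.normal(size=20)

blr_mean, blr_var = blr_predict(features(X_train), y_train, features(X_test))
gp_mean, gp_var = gp_linear_predict(features(X_train), y_train, features(X_test))

# Both differences should be at the level of floating-point round-off.
print(np.max(np.abs(blr_mean - gp_mean)), np.max(np.abs(blr_var - gp_var)))
```

The practical difference is cost: the weight-space form inverts a $d \times d$ matrix (feature dimension), whereas the function-space form solves an $n \times n$ system (number of evaluated seed sets), which is one reason a basis-function surrogate can scale better as the dataset grows.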

Figures (5)

  • Figure 1: Comparison of related IM problems, where $u_1$ is the initial seed user. (a) Canonical IM in a homogeneous network. (b) Heterogeneous IM on networks with multiple node and edge types. (c) Multi-layer IM with users on two social network platforms. (d) Multiplex IM with three information items $\{v_1, v_2, v_3\}$, where $(u_1, v_1)$ is the seed.
  • Figure 2: Overview of the proposed GBIM framework, which consists of two modules: a surrogate model and data acquisition. Initially, random seed sets are evaluated by the true model $\mathcal{M}(\boldsymbol{x})$ to form the dataset $\mathcal{D}$. We then iteratively: (1) train the surrogate model $\mathcal{M}^*(\boldsymbol{x})$ on $\mathcal{D}$; (2) sample candidates $\mathcal{X}$ and evaluate them via $\mathcal{M}^*(\boldsymbol{x})$, selecting the top $K$ seed sets as $\mathcal{X}^*$; (3) assess $\mathcal{X}^*$ via $\mathcal{M}(\boldsymbol{x})$ to expand $\mathcal{D}$. Finally, the seed set $\boldsymbol{x}^*$ in $\mathcal{D}$ with maximal influence is selected.
  • Figure 3: Performance comparison on Multi-LT (first row) and Multi-IC (second row) as the seed set size grows. The traditional approaches IMM and CELF++ exceeded time and memory limits under Multi-LT on the Synthetic dataset, and under Multi-IC on the Ciao, Epinions, and Synthetic datasets.
  • Figure 4: Scalability of GBIM under the Multi-IC model on synthetic data. (a) Near-linear runtime scaling with the number of users. (b) Stable runtime as the number of items increases. (c) Linear runtime as $|\mathcal{D}|$ increases.
  • Figure 5: Performance of GBIM over iterations under the Multi-LT model on the Ciao dataset, with different exploit rates $\alpha$. An appropriate value of $\alpha$ balances exploitation against exploration, allowing GBIM to optimize efficiently.
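
The iterative loop described in the Figure 2 caption can be sketched on a toy problem. Everything below is an assumption made for illustration, not the GBIM implementation: `true_model` is a hypothetical closed-form score standing in for a diffusion simulation, the surrogate is plain Bayesian linear regression on binary seed indicators rather than the attention message-passing module, and candidates are ranked with an upper-confidence-bound score (weight `beta`) swapped in for the paper's acquisition rule with exploit rate $\alpha$.

```python
import numpy as np

rng = np.random.default_rng(0)
N_PAIRS, BUDGET = 30, 5  # hypothetical: 30 candidate (user, item) pairs, 5 seeds

# Hypothetical stand-in for the true model M(x); a real run would simulate diffusion.
_gain = rng.uniform(0.0, 1.0, N_PAIRS)

def true_model(x):
    return float(np.sqrt(x @ _gain))

def sample_seed_sets(n):
    """Draw n random seed sets as binary indicator vectors of size BUDGET."""
    X = np.zeros((n, N_PAIRS))
    for row in X:
        row[rng.choice(N_PAIRS, BUDGET, replace=False)] = 1.0
    return X

def fit_blr(X, y, alpha=1.0, noise_var=0.01):
    """Bayesian linear regression: posterior mean and covariance of the weights."""
    A_inv = np.linalg.inv(alpha * np.eye(X.shape[1]) + X.T @ X / noise_var)
    return A_inv @ X.T @ y / noise_var, A_inv

def run_loop(n_init=10, n_iter=15, n_cand=200, top_k=3, beta=1.0):
    # Initial dataset D: random seed sets scored by the true model.
    X = sample_seed_sets(n_init)
    y = np.array([true_model(x) for x in X])
    for _ in range(n_iter):
        w, A_inv = fit_blr(X, y)                       # (1) train surrogate on D
        cand = sample_seed_sets(n_cand)                # (2) sample candidates
        mean = cand @ w
        var = np.einsum("ij,jk,ik->i", cand, A_inv, cand)
        score = mean + beta * np.sqrt(np.maximum(var, 0.0))
        picked = cand[np.argsort(score)[-top_k:]]      # keep the top-K candidates
        X = np.vstack([X, picked])                     # (3) evaluate and expand D
        y = np.concatenate([y, [true_model(x) for x in picked]])
    best = int(np.argmax(y))
    return X[best], float(y[best]), y

x_best, y_best, history = run_loop()
print(int(x_best.sum()), round(y_best, 3), len(history))
```

Setting `beta` to zero makes the selection purely exploitative, while larger values favor candidates the surrogate is uncertain about; this is the same trade-off that the exploit rate in Figure 5 controls, expressed through a different parameterization.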

Theorems & Definitions (5)

  • Definition 1: Multiplex Influence Maximization
  • Definition 2: Association Mechanism
  • Definition 3: Multiplex Diffusion Model
  • Theorem 1
  • proof