IM-META: Influence Maximization Using Node Metadata in Networks With Unknown Topology

Cong Tran, Won-Yong Shin, Andreas Spitz

TL;DR

IM-META is a solution to influence maximization (IM) in networks with unknown topology that retrieves information from queries and node metadata.

Abstract

Since the structure of complex networks is often unknown, we may identify the most influential seed nodes by exploring only a part of the underlying network, given a small budget for node queries. We propose IM-META, a solution to influence maximization (IM) in networks with unknown topology by retrieving information from queries and node metadata. Since using such metadata is not without risk due to the noisy nature of metadata and uncertainties in connectivity inference, we formulate a new IM problem that aims to find both seed nodes and queried nodes. In IM-META, we develop an effective method that iteratively performs three steps: 1) we learn the relationship between collected metadata and edges via a Siamese neural network, 2) we select a number of inferred confident edges to construct a reinforced graph, and 3) we identify the next node to query by maximizing the inferred influence spread using our topology-aware ranking strategy. Through experimental evaluation of IM-META on four real-world datasets, we demonstrate a) the speed of network exploration via node queries, b) the effectiveness of each module, c) the superiority over benchmark methods, d) the robustness to more difficult settings, e) the hyperparameter sensitivity, and f) the scalability.
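
The iterative loop described in the abstract can be outlined with a minimal Python sketch. It is a skeleton under stated assumptions, not the authors' implementation: a fixed Jaccard similarity over metadata sets stands in for the Siamese neural network, a plain threshold stands in for confident-edge selection, and "query the known node with the most inferred unseen neighbors" stands in for the topology-aware ranking strategy. The names `jaccard`, `explore`, `true_neighbors`, and `threshold` are hypothetical, and the final seed selection on the explored subgraph is omitted.

```python
from itertools import combinations


def jaccard(a, b):
    """Stand-in edge scorer: overlap between two metadata sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def explore(true_neighbors, metadata, start, budget, threshold=0.5):
    """Query up to `budget` nodes, re-inferring edges before each query."""
    known = {start}   # nodes discovered so far
    queried = []      # nodes queried so far, in order
    for _ in range(budget):
        # Steps 1-2 (stand-in): score node pairs from metadata and keep the
        # confident ones. A learned model would be re-fit here on the edges
        # revealed so far; this fixed scorer does not change between rounds.
        inferred = {u: set() for u in metadata}
        for u, v in combinations(sorted(metadata), 2):
            if jaccard(metadata[u], metadata[v]) >= threshold:
                inferred[u].add(v)
                inferred[v].add(u)
        # Step 3 (stand-in): pick the known, unqueried node whose inferred
        # neighborhood contains the most nodes not yet discovered.
        candidates = sorted(known - set(queried))
        if not candidates:
            break
        nxt = max(candidates, key=lambda u: len(inferred[u] - known))
        queried.append(nxt)
        # A query reveals the true neighbors of the queried node.
        known |= set(true_neighbors[nxt])
    return queried, known


# Toy data (made up for illustration only).
toy_metadata = {
    "a": {1, 2}, "b": {1, 2}, "c": {1, 2, 9}, "d": {2, 9}, "e": {9},
}
toy_neighbors = {
    "a": ["b", "c"], "b": ["a"], "c": ["a", "d"], "d": ["c", "e"], "e": ["d"],
}
toy_queried, toy_known = explore(toy_neighbors, toy_metadata, "a", budget=2)
```

On this toy input the second query goes to `c` rather than `b`, because the inferred edge from `c` to `d` points to a node that has not been discovered yet.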

Paper Structure

This paper contains 29 sections, 2 theorems, 3 equations, 11 figures, 4 tables, and 2 algorithms.

Key Result

Theorem 1

The computational complexity of the proposed IM-META method is no higher than that of the greedy IM algorithm.
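
For context on the baseline named in Theorem 1, the greedy IM algorithm is commonly implemented by adding, one at a time, the node with the largest estimated influence spread, with the spread estimated by Monte Carlo simulation. The sketch below assumes the independent cascade (IC) model and a graph stored as a dictionary of per-edge diffusion probabilities. The function names and the parameters `runs` and `seed` are illustrative and not taken from the paper.

```python
import random


def simulate_ic(graph, seeds, rng):
    """Run one independent-cascade diffusion; return the number of active nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                # Each newly active node gets one chance to activate each neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_spread(graph, seeds, runs=200, seed=0):
    """Average the cascade size over `runs` simulations."""
    rng = random.Random(seed)
    return sum(simulate_ic(graph, seeds, rng) for _ in range(runs)) / runs


def greedy_im(graph, k, runs=200, seed=0):
    """Pick k seeds greedily by estimated spread of the enlarged seed set."""
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(k):
        candidates = sorted(nodes - set(chosen))
        best = max(
            candidates,
            key=lambda v: estimate_spread(graph, chosen + [v], runs, seed),
        )
        chosen.append(best)
    return chosen


# Toy directed graph: node -> {neighbor: diffusion probability}.
toy_graph = {"a": {"b": 1.0, "c": 1.0, "d": 1.0}, "e": {"f": 1.0}}
toy_seeds = greedy_im(toy_graph, k=2)
```

Each greedy round evaluates every remaining candidate with `runs` simulations, so the cost of this baseline grows with the number of nodes, the seed budget $k$, and the number of simulations.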

Figures (11)

  • Figure 1: A schematic overview of our proposed IM-META method for $T=1$ node query and $k=2$ seed nodes. Here, after adding seven inferred edges to obtain $G^{(0)}_\text{gen-prun}$, node $v_2$ (a filled-in yellow circle) is selected. Then, the seed set of two nodes $v_1$ and $v_3$ (filled-in red circles) is chosen from $\mathcal{V}_1$ in the subgraph $G_1$.
  • Figure 2: An illustration of reinforced weighted graph generation for $\epsilon=0.5$ under the two diffusion models. Here, the value on each dashed line indicates $\theta_{uv}^{(t)} \in \Theta^{(t)}$, while the value on each solid line represents the diffusion probability in the reinforced weighted graph $G_\text{gen-prun}^{(t)}$.
  • Figure 3: An example illustrating that querying a high degree node is not always beneficial given an initial subgraph $G_0$ when $T=k=1$ and $G_\text{gen-prun}^{(0)}$ correctly reflects the underlying graph $G$.
  • Figure 4: The size of explored subgraphs, $|\mathcal{V}_t|$, as a function of the number of queried nodes, $t$.
  • Figure 5: $\sigma$ as a function of $T$ when $k = 5$ (IC model).
  • ...and 6 more figures

Theorems & Definitions (6)

  • Example 1
  • Definition 1
  • Example 2
  • Theorem 1
  • Theorem 2
  • proof