GNN-Based Candidate Node Predictor for Influence Maximization in Temporal Graphs

Priyanka Gautam, Balasubramaniam Natarajan, Sai Munikoti, S M Ferdous, Mahantesh Halappanavar

TL;DR

This work tackles influence maximization on temporally evolving graphs by introducing a dynamic framework that blends Graph Neural Networks with Bi-directional LSTM to predict candidate seed nodes. The GNN-LSTM pipeline extracts spatial embeddings via GraphSAGE and models temporal evolution with BiLSTM, producing candidate sets whose seeds are selected by a Greedy algorithm, guided by the IFC heuristic to reduce computation. Empirical results on real-world and synthetic datasets show the approach achieves 81–98% of Greedy's spread while significantly reducing compute time, demonstrating strong scalability across network sizes. The method is adaptable to various diffusion models and can enhance rapid, scalable seed selection in domains like viral marketing and information diffusion in dynamic networks.
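The candidate-then-Greedy step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes an Independent Cascade diffusion model with a uniform activation probability `p` (the summary does not name the diffusion model), and the names `simulate_ic`, `estimate_spread`, `greedy_from_candidates`, and `toy_graph` are hypothetical. The point it shows is that Greedy only evaluates nodes in the predicted candidate set, not every node in the graph.

```python
import random

def simulate_ic(graph, seeds, p, rng):
    """Run one Independent Cascade simulation; return the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly activated node gets one chance to activate each inactive neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)

def estimate_spread(graph, seeds, p=0.1, mc=100, seed=0):
    """Average the cascade size over mc Monte Carlo runs (assumed uniform probability p)."""
    rng = random.Random(seed)
    return sum(simulate_ic(graph, seeds, p, rng) for _ in range(mc)) / mc

def greedy_from_candidates(graph, candidates, k, p=0.1, mc=100):
    """Greedy seed selection restricted to a candidate set instead of all nodes."""
    chosen = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        # Pick the candidate whose addition gives the largest estimated spread.
        best = max(remaining, key=lambda c: estimate_spread(graph, chosen + [c], p, mc))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Toy directed graph as an adjacency dict (hypothetical data for illustration).
toy_graph = {0: [1, 2, 3, 4], 5: [6], 7: []}
seeds = greedy_from_candidates(toy_graph, candidates=[0, 5, 7], k=2, p=1.0, mc=10)
```

With `n` nodes, plain Greedy runs roughly `k * n` spread estimates per snapshot, while the restricted version runs about `k * len(candidates)`, which is where the time savings would come from when the candidate set is small.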

Abstract

In an age where information spreads rapidly across social media, effectively identifying influential nodes in dynamic networks is critical. Traditional influence maximization strategies often fail to keep up with rapidly evolving relationships and structures, leading to missed opportunities and inefficiencies. To address this, we propose a novel learning-based approach integrating Graph Neural Networks (GNNs) with Bidirectional Long Short-Term Memory (BiLSTM) models. This hybrid framework captures both structural and temporal dynamics, enabling accurate prediction of candidate nodes for seed set selection. The bidirectional nature of BiLSTM allows our model to analyze patterns from both past and future network states, ensuring adaptability to changes over time. By dynamically adapting to graph evolution at each time snapshot, our approach improves seed set calculation efficiency, achieving an average of 90% accuracy in predicting potential seed nodes across diverse networks. This significantly reduces computational overhead by optimizing the number of nodes evaluated for seed selection. Our method is particularly effective in fields like viral marketing and social network analysis, where understanding temporal dynamics is crucial.

Paper Structure

This paper contains 20 sections, 6 equations, 5 figures, 1 table, 1 algorithm.

Figures (5)

  • Figure 1: Overall flow diagram of the proposed framework for computing the seed set, i.e., the most influential nodes. The framework takes graph snapshots over $t$ time intervals and processes each one through a GraphSAGE-based GNN to extract spatial features, generating embeddings $H_t^l$ for each snapshot. These embeddings are then fed into a BiLSTM, which uses a sequence-to-sequence model to capture temporal dependencies. In this example, the BiLSTM uses the embeddings from the previous $d$ time intervals, here $d = 3$ ($H_{t-3}^l$, $H_{t-2}^l$, $H_{t-1}^l$), to predict the embedding for the next interval ($H_t^l$). The candidate node predictor identifies a candidate set of influential nodes, and the Greedy algorithm selects the optimal seed set $S_{tk}$ from this candidate set for each snapshot.
  • Figure 2: Comparison of influence spread achieved by the Greedy algorithm versus our candidate node predictor approach across 19 graph snapshots on the Email core temporal department messages network. The spread is measured using Monte Carlo simulations with 100 iterations ($mc = 100$) and a seed set size $k = 5$. The dips in influence spread are due to changes in the network structure, as a few old edges drop out and new edges form.
  • Figure 3: Comparison of computational time taken by the Greedy algorithm versus our candidate node predictor approach across 19 graph snapshots on the Email core temporal department messages network, measured in seconds. The experiments were conducted using Monte Carlo simulations with 100 iterations ($mc = 100$).
  • Figure 4: Comparison of influence spread achieved by the Greedy algorithm versus our candidate node predictor approach across 15 graph snapshots on the Synthetic Random-Barabasi-Albert network with 1000 nodes. The spread is measured using Monte Carlo simulations with 1000 iterations ($mc = 1000$) and a seed set size $k = 5$.
  • Figure 5: Comparison of computational time taken by the Greedy algorithm versus our candidate node predictor approach across 15 graph snapshots on the Synthetic Random-Barabasi-Albert network with 1000 nodes. The experiments were conducted using Monte Carlo simulations with 1000 iterations ($mc = 1000$) and a seed set size $k = 5$.
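
The windowing described in the Figure 1 caption, where embeddings from the previous $d$ snapshots are used to predict the next one, can be illustrated with a short sketch. It shows only how (history, target) training pairs might be assembled, not the BiLSTM itself. The function name `make_windows` is hypothetical, and plain strings stand in for the per-snapshot embedding matrices $H_t^l$ that a GraphSAGE model would produce.

```python
def make_windows(snapshot_embeddings, d=3):
    """Pair each run of d consecutive snapshot embeddings with the embedding that follows."""
    if d < 1:
        raise ValueError("d must be at least 1")
    pairs = []
    for t in range(d, len(snapshot_embeddings)):
        history = snapshot_embeddings[t - d:t]  # H_{t-d}, ..., H_{t-1}
        target = snapshot_embeddings[t]         # H_t, the embedding to predict
        pairs.append((history, target))
    return pairs

# Placeholder embeddings for six snapshots (stand-ins for real embedding matrices).
embeddings = [f"H_{t}" for t in range(6)]
pairs = make_windows(embeddings, d=3)
for history, target in pairs:
    print(history, "->", target)
```

With six snapshots and $d = 3$, this yields three pairs, the first being ($H_0$, $H_1$, $H_2$) mapped to $H_3$. A sequence model would then be trained on such pairs, one window per prediction step.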