GIN-SD: Source Detection in Graphs with Incomplete Nodes via Positional Encoding and Attentive Fusion

Le Cheng, Peican Zhu, Keke Tang, Chao Gao, Zhen Wang

TL;DR

GIN-SD tackles rumor source detection under incomplete node data by combining two components: a Positional Embedding Module, which fuses Laplacian positional encodings (PEs) of the infected subgraph with user state and diffusion features, and an Attentive Fusion Module, which uses multi-head self-attention to weight informative nodes. A class-balancing loss addresses the inherent source/non-source imbalance, enabling effective learning despite missing data. Across eight real-world networks and varying incomplete-node ratios, GIN-SD consistently outperforms state-of-the-art methods and remains robust in early-detection settings. The approach is relevant to rumor source localization under privacy and data-loss constraints, where propagation dynamics must compensate for missing user information.

Abstract

Source detection in graphs has demonstrated robust efficacy in the domain of rumor source identification. Although recent solutions have enhanced performance by leveraging deep neural networks, they often require complete user data. In this paper, we address a more challenging task, rumor source detection with incomplete user data, and propose a novel framework, i.e., Source Detection in Graphs with Incomplete Nodes via Positional Encoding and Attentive Fusion (GIN-SD), to tackle this challenge. Specifically, our approach utilizes a positional embedding module to distinguish nodes that are incomplete and employs a self-attention mechanism to focus on nodes with greater information transmission capacity. To mitigate the prediction bias caused by the significant disparity between the numbers of source and non-source nodes, we also introduce a class-balancing mechanism. Extensive experiments validate the effectiveness of GIN-SD and its superiority to state-of-the-art methods.
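The abstract does not spell out the class-balancing mechanism. As a minimal sketch of one common way to counter such an imbalance, the example below down-weights the non-source term of a binary cross-entropy loss by the ratio of source to non-source counts. The function name `class_balanced_bce`, the weight `xi`, and the choice of cross-entropy are assumptions made for illustration, not details taken from the text above.

```python
import numpy as np

def class_balanced_bce(probs, labels, eps=1e-12):
    """Binary cross-entropy with the non-source term down-weighted.

    probs  : predicted probability that each node is a source
    labels : 1 for source nodes, 0 for non-source nodes
    Assumption: the weight xi = (#sources / #non-sources) is an
    illustrative choice, not necessarily the paper's exact form.
    """
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    labels = np.asarray(labels, dtype=float)

    n_src = labels.sum()
    n_non = labels.size - n_src
    # With few sources and many non-sources, xi is small, so the
    # majority class cannot dominate the total loss.
    xi = n_src / n_non if n_non > 0 else 1.0

    src_term = -(labels * np.log(probs)).sum()
    non_term = -((1.0 - labels) * np.log(1.0 - probs)).sum()
    return float(src_term + xi * non_term)

# One source among four nodes, with uninformative predictions.
loss = class_balanced_bce([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 0])
print(loss)
```

With this weighting, the single source node and the three non-source nodes contribute equally to the total, which is the intended balancing effect.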

Paper Structure (32 sections, 20 equations, 4 figures, 4 tables)

This paper contains 32 sections, 20 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Impact of incomplete nodes on source detection: (a-c) graphs with incomplete node ratios of 0%, 10%, and 20%; (d) influence of varying incomplete node ratios on the source detection accuracy for different methods. As the proportion of incomplete nodes increases, the performance of other methods declines more significantly, while our approach remains less affected.
  • Figure 2: Illustration of GIN-SD. (a) The network snapshot $G'$ serves as the input of GIN-SD. (b) The Positional Embedding Module (PEM) embeds each node's positional information, together with its state and propagation information, into a feature vector. To obtain the positional encodings, the infected subgraph is first extracted from the acquired snapshot and its adjacency matrix is formed; the symmetric normalized Laplacian matrix is then calculated, and the positional encoding of each node is derived through factorization. (c) The Attentive Fusion Module (AFM) learns node representations through self-attention mechanisms. (d) The training loss is computed using the class-balancing mechanism, and the detected source set $\hat{s}$ is output during the testing phase.
  • Figure 3: The performance of different methods in early detection of rumor sources.
  • Figure 4: The impact of varying degrees of user information loss on source detection accuracy.
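
The positional-encoding steps described in the Figure 2 caption (extract the infected subgraph, form its adjacency matrix, compute the symmetric normalized Laplacian, factorize) can be sketched as follows. This is an illustrative reading rather than the authors' implementation: the function name, the use of the `k` eigenvectors with the smallest eigenvalues, and the zero vector assigned to nodes outside the infected subgraph are all assumptions.

```python
import numpy as np

def laplacian_positional_encoding(adj, infected, k):
    """Return an (n, k) array of positional encodings.

    adj      : (n, n) symmetric 0/1 adjacency matrix of the snapshot
    infected : length-n boolean mask of infected nodes
    k        : number of eigenvectors to keep (assumed hyperparameter)
    """
    adj = np.asarray(adj, dtype=float)
    idx = np.flatnonzero(np.asarray(infected, dtype=bool))

    # 1. Adjacency matrix of the infected subgraph.
    sub = adj[np.ix_(idx, idx)]

    # 2. Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    deg = sub.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    lap = np.eye(len(idx)) - inv_sqrt[:, None] * sub * inv_sqrt[None, :]

    # 3. Factorize; eigh returns eigenvalues in ascending order.
    _, eigvecs = np.linalg.eigh(lap)

    # 4. Keep the first k eigenvectors as per-node encodings.
    #    Assumption: nodes outside the infected subgraph get zeros.
    k_eff = min(k, len(idx))
    pe = np.zeros((adj.shape[0], k))
    pe[idx, :k_eff] = eigvecs[:, :k_eff]
    return pe

# Path graph 0-1-2-3 in which nodes 0, 1, 2 are infected.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
pe = laplacian_positional_encoding(adj, [True, True, True, False], k=2)
print(pe.shape)
```

Eigenvector signs are arbitrary, so encodings produced this way are only defined up to a sign flip per column.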