Source Localization for Cross Network Information Diffusion

Chen Ling, Tanmoy Chowdhury, Jie Ji, Sirui Li, Andreas Züfle, Liang Zhao

TL;DR

The paper tackles cross-network diffusion source localization by formulating the problem as recovering the source seed set on a source network from observations in a target network. It introduces CNSL, a framework that combines mean-field variational inference to approximate the latent distribution of seeds, a disentangled latent prior to separately encode dynamic and static node features, and a cross-network diffusion model that jointly learns propagation patterns on both networks while respecting cross-network bridge links. Key contributions include (i) a MAP-based latent-distribution learning approach with disentangled encoders, (ii) a cross-network diffusion model that decouples source and target propagation with a monotonicity constraint, and (iii) two new cross-network datasets (real and simulated) and comprehensive experiments showing CNSL outperforms single-network baselines across multiple diffusion patterns. The results demonstrate CNSL’s improved accuracy (F1, AUC, PR@100) and robustness, with practical implications for mitigating misinformation spread in interconnected platforms. Overall, the work provides a scalable, uncertainty-aware method for source localization in cross-network diffusion, highlighting significant theoretical and applied benefits for information integrity in interconnected systems.

Abstract

Source localization aims to locate the sources of information diffusion given only the diffusion observation, a problem that has attracted extensive attention in the past few years. Existing methods are mostly tailored to single networks and may not generalize to more complex structures such as cross-networks. A cross-network is defined as two interconnected networks, where one network's functionality depends on the other. Source localization on cross-networks entails locating diffusion sources on the source network given only the diffusion observed in the target network. The task poses three primary challenges: 1) modeling the distribution of diffusion sources; 2) jointly considering static and dynamic node features; and 3) learning heterogeneous diffusion patterns. In this work, we propose a novel method, CNSL, to address these challenges. Specifically, we learn the distribution of diffusion sources through Bayesian inference and leverage disentangled encoders to learn static and dynamic node features separately. The learning objective is coupled with a cross-network information propagation estimation model so that the inference of diffusion sources accounts for the overall diffusion process. Additionally, we provide two novel cross-network datasets that we collected ourselves. Extensive experiments on both datasets demonstrate the effectiveness of CNSL in source localization on cross-networks.
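
To make the problem setup concrete, the following minimal Python sketch simulates the forward process that source localization tries to invert: seeds spread on a source network, cross over bridge links, and continue spreading on a target network, where only the final infected set would be observed. It is an illustration under stated assumptions, not CNSL's learned diffusion model: it uses a plain independent-cascade rule on both networks, and the function names, the adjacency-dictionary format, and the probabilities `p_source`, `p_bridge`, and `p_target` are hypothetical.

```python
import random


def independent_cascade(adj, seeds, p, rng):
    """Spread from `seeds` over `adj`; each newly infected node gets one
    chance to infect each uninfected neighbor, with probability p."""
    infected = set(seeds)
    frontier = sorted(infected)  # sorted for a reproducible visiting order
    while frontier:
        newly_infected = []
        for u in frontier:
            for v in adj.get(u, ()):
                if v not in infected and rng.random() < p:
                    infected.add(v)
                    newly_infected.append(v)
        frontier = newly_infected
    return infected


def cross_network_diffusion(source_adj, target_adj, bridges, seeds,
                            p_source, p_bridge, p_target, seed=0):
    """Simulate source-network spread, bridge crossing, then target spread.

    `bridges` maps a source-network node to the target-network nodes it
    links to (an assumed representation of the cross-network bridge links).
    """
    rng = random.Random(seed)
    infected_source = independent_cascade(source_adj, seeds, p_source, rng)

    # Infected source nodes may activate target nodes through bridge links.
    entry_points = set()
    for u in sorted(infected_source):
        for v in bridges.get(u, ()):
            if rng.random() < p_bridge:
                entry_points.add(v)

    infected_target = independent_cascade(target_adj, entry_points, p_target, rng)
    return infected_source, infected_target


# Toy example: a three-node source chain bridged to a two-node target chain.
toy_source = {"a": ["b"], "b": ["c"]}
toy_target = {"x": ["y"]}
toy_bridges = {"c": ["x"]}
src, tgt = cross_network_diffusion(toy_source, toy_target, toy_bridges,
                                   {"a"}, 1.0, 1.0, 1.0)
print(sorted(src), sorted(tgt))  # ['a', 'b', 'c'] ['x', 'y']
```

In this framing, the localization task receives only the target-side infected set (here `tgt`) and must recover the seed set `{"a"}` on the source network.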

Paper Structure

This paper contains 19 sections, 10 equations, 6 figures, 3 tables, 2 algorithms.

Figures (6)

  • Figure 1: Example of misinformation propagation on a cross-network between GitHub and Stack Overflow, where each node in the GitHub network denotes a repository and each node in the Stack Overflow network represents a discussion thread.
  • Figure 2: The training pipeline of CNSL contains three steps: 1) $q_{\phi_1}$ and $q_{\phi_2}$ approximate the distribution of $p(z_s,z_{fs})$ in a disentangled manner; 2) the inferred latent variables $z_s$ and $z_{fs}$ are concatenated to reconstruct $\hat{x}_s$; 3) the reconstructed $\hat{x}_s$ is leveraged as initial seed nodes to initiate the cross-network information propagation and predict expected diffusion $\hat{y}_t$.
  • Figure 3: Runtime comparison with learning-based methods on datasets (a) LT2LT, (b) LT2IC, (c) LT2SIS, (d) IC2LT, (e) IC2IC, (f) IC2SIS, (g) G2S-A-D0, (h) G2S-A-D1, (i) G2S-B-D0, and (j) G2S-B-D1.
  • Figure 4: Precision@100: the precision of the top 100 nodes predicted as seed nodes. The comparison is between our method, CNSL, and the current state of the art, SL-VAE.
  • Figure 5: The graphical model for CNSL, where the solid arrows indicate the variational approximation $q_{\phi_1}(z_s|x_s, G_s)$ and $q_{\phi_2}(z_{fs}|x_s,f_s, G_s)$ to the intractable posterior $p(Z|x_s,f_s, G_s)$. Dashed arrows denote the generative process that decodes $x_s$ from $p_{\theta}(x_s|Z)$ and predicts the information diffusion $p_{\psi}(y_t|x_s, \mathcal{G})$.
  • ...and 1 more figure
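
The Precision@100 metric named in the Figure 4 caption can be sketched as follows. This is a minimal illustration under one reading of the caption: nodes are ranked by a predicted seed score, and the metric is the fraction of the top k ranked nodes that are true seeds. The function name `precision_at_k` and the score-dictionary input are hypothetical and not taken from the paper.

```python
def precision_at_k(scores, true_seeds, k=100):
    """Fraction of the k highest-scoring nodes that are true seed nodes.

    scores:     dict mapping node -> predicted seed score (higher = more likely)
    true_seeds: set of ground-truth seed nodes
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank nodes by descending score; ties keep dictionary insertion order.
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = sum(1 for node in top_k if node in true_seeds)
    # Divide by k (not len(top_k)), so graphs with fewer than k nodes are penalized.
    return hits / k


toy_scores = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.05}
print(precision_at_k(toy_scores, {"a", "c"}, k=2))  # 0.5
```

Dividing by k rather than by the number of returned nodes is a design choice; either convention gives the same value whenever the graph has at least k nodes.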