Predicting Viral Rumors and Vulnerable Users for Infodemic Surveillance

Xuan Zhang; Wei Gao

TL;DR

This work proposes a novel approach to predict viral rumors and vulnerable users using a unified graph neural network model that effectively captures the correlation between rumor virality and user vulnerability, leveraging this information to improve prediction performance and provide a valuable tool for infodemic surveillance.

Abstract

In the age of the infodemic, it is crucial to have tools that effectively monitor the spread of rampant rumors that can quickly go viral and that identify vulnerable users who may be more susceptible to spreading such misinformation. This proactive approach allows timely preventive measures to be taken, mitigating the negative impact of false information on society. We propose a novel approach to predict viral rumors and vulnerable users using a unified graph neural network model. We pre-train network-based user embeddings and leverage a cross-attention mechanism between users and posts, together with a community-enhanced vulnerability propagation (CVP) method, to improve user and propagation graph representations. Furthermore, we employ two multi-task training strategies to mitigate negative transfer effects among tasks in different settings, enhancing the overall performance of our approach. We also construct two datasets with ground-truth annotations of information virality and user vulnerability in rumor and non-rumor events, which are automatically derived from existing rumor detection datasets. Extensive evaluation results confirm the superiority of our joint learning model over strong baselines in all three tasks: rumor detection, virality prediction, and user vulnerability scoring. For instance, compared to the best baselines on the Weibo dataset, our model improves Accuracy and MacF1 for rumor detection by 3.8% and 3.0%, and reduces mean squared error (MSE) by 23.9% and 16.5% for virality prediction and user vulnerability scoring, respectively. Our findings suggest that our approach effectively captures the correlation between rumor virality and user vulnerability, leveraging this information to improve prediction performance and provide a valuable tool for infodemic surveillance.

Paper Structure

This paper contains 41 sections, 4 equations, 9 figures, 5 tables, 1 algorithm.

Introduction
Related Work
Rumor Detection
Information Virality Prediction
User Vulnerability Analysis
Graph Neural Networks (GNNs)
Problem Definition
Methodology
User Interaction Graph Construction
Input Embedding
Time-aware Post Embedding
Pre-train User Embedding with Contrastive Learning
Refined Embedding
Output Layers
Graph Classification for Rumor Detection and Virality Prediction
...and 26 more sections

Figures (9)

Figure 1: An application scenario of the infodemic surveillance system that can predict viral rumors and vulnerable users.
Figure 2: Example rumors and non-rumors of different virality, and the reposts made to them by users of different vulnerabilities, taken from the TWITTER dataset (Ma et al., 2017). Virality is defined as the number of users involved in the spread, and user vulnerability is defined as the fraction of rumor events among all events a user engaged in.
Figure 3: Overview of the proposed multi-task model. (a) Construction. The user interaction network $\mathcal{G}_u$ is constructed based on its corresponding post propagation network $\mathcal{G}$. (b) Input Embedding. We generate the time-aware post embedding $\mathbf{X}_p$ and the general user embedding $\mathbf{X}_u^{(0)}$. (c) Refined Embedding. We obtain latent user community information $\mathbf{X}_c$ via DiffPool. (d) Output Layers. The final graph representation for $\mathcal{G}_u$ and the user representations $\mathbf{X}_u^{(4)}$ updated via CVP are fed to the corresponding classifiers for our three tasks. Note that the three tasks become separated only at (d) Output Layers, while the other stages, (a) Construction, (b) Input Embedding, and (c) Refined Embedding, are shared by all three tasks as part of their inputs and representation learning process.
Figure 4: An illustration of user interaction network construction. For any pair of unique users, we create an edge between them in the user interaction network as long as there is a reposting behavior between their posts in any post propagation network.
Figure 5: The impact of hyper-parameters related to model complexity, evaluated on the validation set.
...and 4 more figures
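
The label definitions in Figure 2 and the edge rule in Figure 4 can be illustrated with a minimal sketch. The toy events, user names, and function names below are hypothetical and are not taken from the paper's datasets or code. Two choices are assumptions made for illustration: the source poster is counted as a user "involved" in an event, and user interaction edges are treated as undirected.

```python
from collections import defaultdict

# Hypothetical toy data: (event_id, is_rumor, reposts), where reposts is a
# list of (reposting_user, reposted_user) pairs within one propagation network.
events = [
    ("e1", True,  [("bob", "alice"), ("carol", "bob")]),
    ("e2", False, [("bob", "dave")]),
    ("e3", True,  [("carol", "alice")]),
]

def users_in(reposts):
    """All distinct users appearing in one event (assumption: includes the source poster)."""
    return {user for pair in reposts for user in pair}

def virality(reposts):
    """Virality as in Figure 2: number of users involved in the spread."""
    return len(users_in(reposts))

def vulnerability(events):
    """Vulnerability as in Figure 2: rumor events / all events a user engaged in."""
    rumor_count = defaultdict(int)
    total_count = defaultdict(int)
    for _, is_rumor, reposts in events:
        for user in users_in(reposts):
            total_count[user] += 1
            rumor_count[user] += int(is_rumor)
    return {user: rumor_count[user] / total_count[user] for user in total_count}

def user_interaction_edges(events):
    """Edge rule as in Figure 4: link two distinct users if any repost connects their posts.

    Assumption: edges are undirected, so each pair is stored as a frozenset.
    """
    edges = set()
    for _, _, reposts in events:
        for reposter, reposted in reposts:
            if reposter != reposted:
                edges.add(frozenset((reposter, reposted)))
    return edges

virality_by_event = {event_id: virality(reposts) for event_id, _, reposts in events}
vulnerability_by_user = vulnerability(events)
edges = user_interaction_edges(events)
```

In this toy example, `bob` takes part in one rumor event and one non-rumor event, so his vulnerability score is 0.5, while `dave` only engages with a non-rumor and scores 0.0. How the paper normalizes or buckets these raw values for training is not specified here, so the sketch stops at the raw definitions.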
