Enhancing Fake News Detection in Social Media via Label Propagation on Cross-modal Tweet Graph
Wanqing Zhao, Yuta Nakashima, Haiyuan Chen, Noboru Babaguchi
TL;DR
This work tackles fake news detection on social media by densifying the tweet interaction graph with cross-modal connections derived from CLIP, addressing the sparsity of traditional social-context graphs. It introduces FCN-LP, which combines a Feature Contextualization Network with a sign-aware label propagation network to exploit both positive and negative correlations among tweets. A domain generalization loss based on Maximum Mean Discrepancy encourages consistent features between seen and unseen events, improving generalization to new contexts. Evaluations on Twitter, PHEME, and Weibo show consistent improvements over state-of-the-art multimodal detectors, and ablations confirm the benefits of contextualization, sign-aware propagation, and the domain generalization loss.
Abstract
Fake news detection in social media has become increasingly important due to the rapid proliferation of personal media channels and the consequent dissemination of misleading information. Existing methods, which primarily rely on multimodal features and graph-based techniques, have shown promising performance in detecting fake news. However, they still face a limitation: sparse graph connections, which hinder capturing possible interactions among tweets. This challenge has motivated us to explore a novel method that densifies the graph's connectivity so that these interactions are better captured. Our method constructs a cross-modal tweet graph using CLIP, which encodes images and text into a unified space, allowing us to extract potential connections based on similarities in text and images. We then design a Feature Contextualization Network with Label Propagation (FCN-LP) to model the interactions among tweets as well as the positive or negative correlations between the predicted labels of connected tweets. The propagated labels from the graph are weighted and aggregated for the final detection. To enhance the model's ability to generalize to unseen events, we introduce a domain generalization loss that encourages consistent features between tweets on seen and unseen events. We evaluate on three publicly available fake news datasets: Twitter, PHEME, and Weibo. Our method consistently improves performance over state-of-the-art methods on all benchmark datasets and demonstrates its ability to generalize fake news detection to unseen events in social media.
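The three ingredients named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the similarity threshold, the max-over-pairings edge rule, the fixed `sign` matrix, the damping factor `alpha`, and the linear-kernel form of the discrepancy are all assumptions made for illustration, and the hand-written toy embeddings merely stand in for CLIP outputs.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def build_cross_modal_graph(text_emb, image_emb, threshold=0.8):
    """Link tweets i and j when any text/image pairing is similar enough.

    text_emb, image_emb: (n, d) arrays assumed to lie in one shared space
    (as CLIP embeddings would). The edge rule is a hypothetical choice.
    """
    t, v = l2_normalize(text_emb), l2_normalize(image_emb)
    # Text-text, image-image, text-image, and image-text similarities.
    sims = np.stack([t @ t.T, v @ v.T, t @ v.T, v @ t.T])
    adj = (sims.max(axis=0) >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj


def signed_label_propagation(adj, sign, scores, alpha=0.5, steps=10):
    """Spread scores in [-1, 1] over edges; a -1 sign flips the neighbour's vote.

    sign[i, j] is +1 where connected tweets are expected to share a label and
    -1 where they are expected to differ. Here it is simply given as input.
    """
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    w = (adj * sign) / deg  # row-normalised signed weights
    out = scores.copy()
    for _ in range(steps):
        # Blend each tweet's own score with its neighbours' signed average.
        out = (1 - alpha) * scores + alpha * (w @ out)
    return out


def linear_mmd2(feats_a, feats_b):
    """Squared MMD with a linear kernel: distance between the feature means."""
    diff = feats_a.mean(axis=0) - feats_b.mean(axis=0)
    return float(diff @ diff)


# Toy data: tweets 0 and 1 share text, tweets 0 and 2 share an image.
text = np.array([[1.0, 0, 0, 0], [1.0, 0, 0, 0], [0, 1.0, 0, 0]])
image = np.array([[0, 0, 1.0, 0], [0, 0, 0, 1.0], [0, 0, 1.0, 0]])
adj = build_cross_modal_graph(text, image)

sign = np.ones((3, 3))
sign[0, 2] = sign[2, 0] = -1.0  # assume tweets 0 and 2 disagree
propagated = signed_label_propagation(adj, sign, np.array([0.9, 0.0, 0.0]))
```

In the toy example, the image-based edge between tweets 0 and 2 would not exist in a graph built from text alone, which is the densification effect the abstract describes. A penalty such as `linear_mmd2` could be added to a training loss to pull the feature distributions of two event groups together; a kernelised variant is the more common choice in practice.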
