Graph with Sequence: Broad-Range Semantic Modeling for Fake News Detection
Junwei Yin, Min Gao, Kai Shu, Wentao Li, Yinqiu Huang, Zongwei Wang
TL;DR
BREAK tackles fake news detection by modeling broad-range semantics with a fully connected sentence graph while mitigating two types of noise, structural and feature noise, via dual denoising in a bi-level optimization. The inner module refines the graph structure between a sequence-based lower bound and the fully connected upper bound, producing a denoised representation, while the outer module aligns graph- and sequence-derived features through KL-divergence to yield the structure and sequence representations $E_{str}$ and $E_{seq}$ that support robust detection. Empirical results on four real-world datasets show BREAK achieving state-of-the-art performance and strong generalization to evidence-enabled settings, outperforming baselines by several percentage points in F1. This approach offers a scalable, content-only framework that effectively captures long-range semantic interrelations for fake news detection with practical resilience to noise and varying article lengths.
Abstract
The rapid proliferation of fake news on social media threatens social stability, creating an urgent demand for more effective detection methods. While many promising approaches have emerged, most rely on content analysis with limited semantic depth, leading to suboptimal comprehension of news content. To address this limitation, capturing broader-range semantics is essential yet challenging, as it introduces two primary types of noise: fully connecting sentences in news graphs often adds unnecessary structural noise, while highly similar but authenticity-irrelevant sentences introduce feature noise, complicating the detection process. To tackle these issues, we propose BREAK, a broad-range semantics model for fake news detection that leverages a fully connected graph to capture comprehensive semantics while employing dual denoising modules to minimize both structural and feature noise. The semantic structure denoising module balances the graph's connectivity by iteratively refining it between two bounds: a sequence-based structure as the lower bound and a fully connected graph as the upper bound. This refinement uncovers label-relevant semantic interrelation structures. Meanwhile, the semantic feature denoising module reduces noise from similar semantics by diversifying representations, aligning distinct outputs from the denoised graph and sequence encoders using KL-divergence to achieve feature diversification in high-dimensional space. The two modules are jointly optimized in a bi-level framework, enhancing the integration of denoised semantics into a comprehensive representation for detection. Extensive experiments across four datasets demonstrate that BREAK significantly outperforms existing fake news detection methods.
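To make the two bounds and the KL-divergence alignment concrete, the following is a minimal Python sketch and not the authors' implementation. All function names are hypothetical. It assumes the sequence-based lower bound is a chain linking each sentence to its immediate neighbours, uses a simple elementwise clamp as a stand-in for the learned iterative refinement, and picks one KL direction arbitrarily, since the abstract does not specify these details.

```python
import math


def sequence_adjacency(n):
    """Assumed lower bound: a chain linking each sentence to its neighbours."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


def full_adjacency(n):
    """Upper bound: every pair of distinct sentences is connected."""
    return [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]


def clamp_structure(scores, lower, upper):
    """Keep each edge weight between the two bounds.

    Stand-in for the learned refinement: a real model would update
    `scores` by gradient steps rather than use fixed values.
    """
    return [
        [min(max(s, lo), up) for s, lo, up in zip(row_s, row_lo, row_up)]
        for row_s, row_lo, row_up in zip(scores, lower, upper)
    ]


def softmax(xs):
    """Turn raw feature scores into a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


if __name__ == "__main__":
    n = 4
    # Hypothetical edge scores for a four-sentence article.
    scores = [
        [0.9, 0.2, 0.7, -0.3],
        [0.2, 0.5, 0.1, 1.4],
        [0.7, 0.1, 0.3, 0.6],
        [-0.3, 1.4, 0.6, 0.8],
    ]
    refined = clamp_structure(scores, sequence_adjacency(n), full_adjacency(n))
    for row in refined:
        print(row)

    # Hypothetical graph- and sequence-encoder outputs for one article.
    graph_dist = softmax([1.0, 2.0, 3.0])
    seq_dist = softmax([3.0, 2.0, 1.0])
    print(kl_divergence(graph_dist, seq_dist))
```

In this sketch, edges between adjacent sentences always survive because the lower bound forces them on, while longer-range edges keep whatever weight the scores assign, capped by the fully connected upper bound.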
