A Comprehensive Low and High-level Feature Analysis for Early Rumor Detection on Twitter

Tu Nguyen

TL;DR

This work tackles early rumor detection on Twitter by combining a CNN+LSTM-based Single Tweet Credibility model with a Dynamic Series-Time Structure (DSTS) that tracks temporal feature dynamics over an event window. It constructs 50+ features across three groups (Ensemble, Twitter-based, Epidemiological) and analyzes their time-dependent impact, highlighting the predictive value of low-level tweet representations and the CreditScore ensemble in the initial hours. The integrated approach achieves competitive accuracy (over 90% in aggregate) and often surpasses baselines, with CreditScore providing especially strong early gains, as demonstrated on a curated 48-hour dataset and a Munich Shooting case study. The results offer practical guidance for real-time rumor debunking while outlining future directions to enhance embeddings and incorporate multimodal signals.

Abstract

Recent work has done a good job of modeling rumors and detecting them over microblog streams. However, the performance of these automatic approaches is relatively low early in the diffusion. A first intuition is that, at an early stage, most of the aggregated rumor features (e.g., propagation features) are not yet mature or distinctive enough. The objective of rumor debunking in microblogs, however, is to detect such misinformation as early as possible. In this work, we leverage neural models to learn the hidden representations of individual rumor-related tweets at the very beginning of a rumor. Our extensive experiments show that the resulting signal improves our classification performance over time, significantly so within the first 10 hours. To deepen the understanding of how these low- and high-level features contribute to model performance over time, we conduct an extensive study of a wide range of high-impact rumor features over a 48-hour range. The end model that uses these features is shown to be competitive: it reaches over 90% accuracy and outperforms strong baselines on our carefully curated dataset.

Paper Structure

This paper contains 27 sections, 3 equations, 13 figures, 12 tables.

Figures (13)

  • Figure 1: Pipeline of our rumor detection approach.
  • Figure 2: The fraction of tweets containing URLs whose domain rank is below 5000.
  • Figure 3: Fitting results of SIS and SEIZ model of (a) Rumor: Robert Byrd was a member of KKK (b) News: Doctor announces Michael Schumacher is making progress.
  • Figure 4: Fitting Results of SIS and SEIZ Model with Only First 10 Hours Tweets' Volume (same 2 stories as above)
  • Figure 5: Fitting Results of SpikeM Model of (a) Rumor: Robert Byrd was a member of KKK and (b) News: Doctor announces Michael Schumacher is making progress.
  • ...and 8 more figures