On Early-stage Debunking Rumors on Twitter: Leveraging the Wisdom of Weak Learners

Tu Nguyen, Cheng Li, Claudia Niederée

TL;DR

This work tackles rumor detection on Twitter at the very early stages by combining two components: a tweet-level credibility model, in which a CNN learns representations of individual tweets, and an ensemble time-series classifier that aggregates the weak tweet-level signals into a CreditScore and fuses it with dynamic features under a Dynamic Series-Time Structure. The approach relies only minimally on propagation features, focusing instead on text-derived signals and temporally aware ensembles; it yields strong early performance (notably within the first hours) and surpasses event-level deep models. Key contributions include a CNN+LSTM tweet representation, a time-series rumor detection framework with CrowdWisdom and CreditScore features, and extensive experiments that show improved early accuracy and give insight into feature importance. The findings are relevant to crisis response and real-time rumor mitigation, and they also highlight sub-event and sub-structure challenges that warrant further refinement.

Abstract

Recently, a lot of progress has been made in rumor modeling and rumor detection for micro-blogging streams. However, existing automated methods do not perform very well for early rumor detection, which is crucial in many settings, e.g., in crisis situations. One reason for this is that aggregated rumor features such as propagation features, which work well in the long run, are, due to their accumulating characteristic, not very helpful in the early phase of a rumor. In this work, we present an approach for early rumor detection that leverages Convolutional Neural Networks to learn hidden representations of individual rumor-related tweets and thereby gain insight into the credibility of each tweet. We then aggregate the predictions from the very beginning of a rumor to obtain the overall event credits (so-called wisdom), and finally combine them with a time-series-based rumor classification model. Our extensive experiments show clearly improved classification performance within the critical very first hours of a rumor. For a better understanding, we also conduct an extensive feature evaluation that emphasizes the early stage and shows that the low-level credibility has the best predictability at all phases of the rumor lifetime.
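
The aggregation step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that each tweet already carries a binary prediction from a weak tweet-level classifier, that the event-level score is the plain cumulative mean of those predictions, and that time is cut into fixed one-hour intervals. The names `TweetPrediction`, `is_credible`, and `credit_score_series` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TweetPrediction:
    # Time of the tweet relative to the first tweet of the event, in hours.
    hours_since_start: float
    # Hypothetical weak-learner output: 1 = predicted credible, 0 = not.
    is_credible: int


def credit_score_series(
    preds: List[TweetPrediction],
    n_intervals: int,
    interval_hours: float = 1.0,
) -> List[Optional[float]]:
    """Return the cumulative mean prediction at the end of each interval.

    Assumption: the event-level score is a simple average of all
    tweet-level predictions seen since the start of the event.
    Intervals that end before any tweet has arrived yield None.
    """
    ordered = sorted(preds, key=lambda p: p.hours_since_start)
    series: List[Optional[float]] = []
    total = 0
    count = 0
    idx = 0
    for i in range(1, n_intervals + 1):
        cutoff = i * interval_hours
        # Consume every tweet posted before the end of this interval.
        while idx < len(ordered) and ordered[idx].hours_since_start < cutoff:
            total += ordered[idx].is_credible
            count += 1
            idx += 1
        series.append(total / count if count else None)
    return series


example = [
    TweetPrediction(0.2, 1),
    TweetPrediction(0.5, 0),
    TweetPrediction(1.5, 0),
    TweetPrediction(3.2, 1),
]
series = credit_score_series(example, n_intervals=4)
```

Because the mean is cumulative, the score is defined as soon as the first tweet arrives, which is the property that makes such a feature usable in the first hours; a per-interval (non-cumulative) mean would be an equally plausible variant under different assumptions.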

Paper Structure

This paper contains 19 sections, 1 equation, 6 figures, 6 tables.

Figures (6)

  • Figure 1: The Munich shooting and its sub-events, which burst after the first 8 hours; the y-axis is English tweet volume.
  • Figure 2: Pipeline of our rumor detection approach.
  • Figure 3: CNN+LSTM for tweet representation.
  • Figure 4: Accuracy: All features with and without CreditScore.
  • Figure 5: CreditScore and ContainsNews for the Munich shooting (red lines), compared with the corresponding average scores for rumor and news.
  • ...and 1 more figure