Unmasking Superspreaders: Data-Driven Approaches for Identifying and Comparing Key Influencers of Conspiracy Theories on X.com

Florian Kramer, Henrich R. Greve, Moritz von Zahn, Hayagreeva Rao

TL;DR

Addresses the problem of identifying and comparing Human Superspreaders and Bot Spreaders propagating conspiracy theories on X.com. Introduces a data-driven framework using over $7{,}616{,}569$ COVID-era tweets and 27 ranking metrics, including an Engagement Score $ES_i$ with $ES_i=RT_i+R_i+L_i+Q_i$ and its normalization $ES_i^{norm}$, plus adapted $H$‑Index, $M$‑Index, and $G$‑Index to rank users. Shows that the adapted $H$‑Index and $G$‑Index effectively identify Human Superspreaders and reveal distinct linguistic, emotional, toxicity, and political-orientation patterns between humans and Bot Spreaders. Argues for interpretable, metric-driven interventions (moderation, suspensions, and awareness campaigns) while noting limitations such as platform/time specificity and survivor bias.
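
The engagement score and the index-style rankings mentioned above can be illustrated with a minimal sketch. It rests on several assumptions that this summary does not confirm: that $RT_i$, $R_i$, $L_i$, and $Q_i$ stand for retweets, replies, likes, and quotes, and that the adapted indices follow the classic bibliometric definitions with a user's tweets in place of papers and per-tweet engagement in place of citations. The `Tweet` class and the function names are hypothetical. The normalization $ES_i^{norm}$ and the $M$‑Index are left out because their definitions are not given here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Tweet:
    # Hypothetical container; field meanings are assumed, not taken from the paper.
    retweets: int
    replies: int
    likes: int
    quotes: int


def engagement_score(t: Tweet) -> int:
    # ES_i = RT_i + R_i + L_i + Q_i, under the assumed symbol meanings.
    return t.retweets + t.replies + t.likes + t.quotes


def h_index(scores: Sequence[int]) -> int:
    # Classic h-index: largest h such that h items each score at least h.
    ranked = sorted(scores, reverse=True)
    h = 0
    for rank, score in enumerate(ranked, start=1):
        if score < rank:
            break
        h = rank
    return h


def g_index(scores: Sequence[int]) -> int:
    # Classic g-index: largest g such that the top g items
    # together score at least g squared (capped at the item count).
    ranked = sorted(scores, reverse=True)
    total = 0
    g = 0
    for rank, score in enumerate(ranked, start=1):
        total += score
        if total >= rank * rank:
            g = rank
    return g


# One user's conspiracy-related tweets (made-up numbers for illustration).
tweets = [
    Tweet(4, 3, 2, 1),
    Tweet(2, 1, 1, 1),
    Tweet(1, 1, 1, 0),
    Tweet(0, 0, 1, 0),
    Tweet(0, 0, 0, 0),
]
scores = [engagement_score(t) for t in tweets]
print(scores, h_index(scores), g_index(scores))
```

Under these definitions the $G$‑Index gives more weight to a few highly engaged tweets than the $H$‑Index does, which is one reason the two rankings can disagree for the same user.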

Abstract

Conspiracy theories can threaten society by spreading misinformation, deepening polarization, and eroding trust in democratic institutions. Social media often fuels the spread of conspiracies, primarily driven by two key actors: Superspreaders -- influential individuals disseminating conspiracy content at disproportionately high rates, and Bots -- automated accounts designed to amplify conspiracies strategically. To counter the spread of conspiracy theories, it is critical both to identify these actors and to better understand their behavior. However, systematic analyses of these actors and real-world-applicable identification methods are still lacking. In this study, we leverage over seven million tweets from the COVID-19 pandemic to analyze key differences between Human Superspreaders and Bots across dimensions such as linguistic complexity, toxicity, and hashtag usage. Our analysis reveals distinct communication strategies: Superspreaders tend to use more complex language and substantive content while relying less on structural elements like hashtags and emojis, likely to enhance credibility and authority. By contrast, Bots favor simpler language and strategic cross-usage of hashtags, likely to increase accessibility, facilitate infiltration into trending discussions, and amplify reach. To counter both Human Superspreaders and Bots, we propose and evaluate 27 novel metrics for quantifying the severity of conspiracy theory spread. Our findings highlight the effectiveness of an adapted H-Index for computationally feasible identification of Human Superspreaders. By identifying behavioral patterns unique to Human Superspreaders and Bots and by providing suitable identification methods, this study lays a foundation for mitigation strategies, including platform moderation policies, temporary and permanent account suspensions, and public awareness campaigns.

Paper Structure

This paper contains 18 sections, 3 equations, 7 figures, and 2 tables.

Figures (7)

  • Figure 1: Dismantling Analysis for Identifying the Most Influential Human Superspreaders
  • Figure 2: Comparison of H-Index and G-Index using Difference in Remaining Users
  • Figure 3: Comparison of H-Index and G-Index using Cramér–von Mises Two-Sample Test
  • Figure 4: Analysis of Linguistic Features and Content Characteristics using Binary Variables
  • Figure 5: Visualization of Strategic Hashtag Pairing using Co-Hashtag Networks of Bot Spreaders. Node color indicates hashtag type: conspiracy-related hashtags are shown in red, non-conspiracy hashtags in gray.
  • ...and 2 more figures