Classifying Conspiratorial Narratives At Scale: False Alarms and Erroneous Connections

Ahmad Diab, Rr. Nefriana, Yu-Ru Lin

TL;DR

This study develops a general, topic-independent scheme for classifying online conspiracy theory (CT) narratives that incorporates authors’ stance and explicit or implicit narrative cues. A BERT-family classifier (notably RoBERTa) trained on a human-labeled Reddit corpus performs competitively with GPT in detecting conspiracy narratives; the comparison highlights GPT’s strengths in expressiveness but also critical weaknesses in reasoning and a susceptibility to causal hallucinations. Using the best classifier, the authors conduct a large-scale analysis of CT narratives across active CT-related subreddits, finding that about one-third of posts are CT narratives and that CT-promoting posts tend to receive higher engagement. A cross-domain evaluation on Twitter data suggests the approach generalizes across platforms. Overall, the work advances CT detection by providing a theory-grounded, scalable classifier and offering insights into model limitations, platform dynamics, and potential moderation implications.

Abstract

Online discussions frequently involve conspiracy theories, which can contribute to the proliferation of belief in them. However, not all discussions surrounding conspiracy theories promote them, as some are intended to debunk them. Existing research has relied on simple proxies or focused on a constrained set of signals to identify conspiracy theories, which limits our understanding of conspiratorial discussions across different topics and online communities. This work establishes a general scheme for classifying discussions related to conspiracy theories based on authors' perspectives on the conspiracy belief, which can be expressed explicitly through narrative elements, such as the agent, action, or objective, or implicitly through references to known theories, such as chemtrails or the New World Order. We leverage human-labeled ground truth to train a BERT-based model for classifying online conspiracy theories (CTs), which we then compare to the Generative Pre-trained Transformer machine (GPT) for detecting online conspiratorial content. Despite GPT's known strengths in expressiveness and contextual understanding, our study reveals significant flaws in its logical reasoning, while also demonstrating comparable strengths from our classifiers. We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums and find that only one-third of the posts are classified as positive. This research sheds light on the potential applications of large language models in tasks demanding nuanced contextual comprehension.

Paper Structure (33 sections, 1 figure, 6 tables)