SLURG: Investigating the Feasibility of Generating Synthetic Online Fallacious Discourse

Cal Blanco, Gavin Dsouza, Hugo Lin, Chelsey Rush

TL;DR

This study investigates the feasibility of generating synthetic fallacious discourse for online forums in the context of the Ukrainian-Russian conflict, using a unified fallacy taxonomy and large language models. It combines human annotation, inter-annotator agreement analysis, and zero- and few-shot LLM-based annotation and generation to build and assess synthetic data. The findings indicate that LLMs can reproduce real-world syntactic patterns and that careful few-shot prompting enhances vocabulary diversity, though realism and annotation subjectivity remain challenges. The work demonstrates both the potential and the limitations of synthetic data for supporting fallacy detection in domain-specific online discourse, with implications for detector training and media literacy, and it highlights ethical considerations.

Abstract

In this paper, we explore the definition and extrapolation of fallacies as they pertain to the automatic detection of manipulation on social media. In particular, we explore how these logical fallacies might appear in the real world, i.e., on internet forums. We found misinformation and misguided intent to be prevalent in discussion boards centered on the Ukrainian-Russian conflict, which narrows the domain of our task. Although automatic fallacy detection has gained attention recently, most datasets use unregulated fallacy taxonomies or are limited to formal linguistic domains such as political debates or news reports. Online discourse, however, often features non-standardized and diverse language not captured in these domains. We present Shady Linguistic Utterance Replication-Generation (SLURG) to address these limitations, exploring the feasibility of generating synthetic fallacious forum-style comments using large language models (LLMs), specifically DeepHermes-3-Mistral-24B. Our findings indicate that LLMs can replicate the syntactic patterns of real data and that high-quality few-shot prompts enhance LLMs' ability to mimic the vocabulary diversity of online forums.

Paper Structure

This paper contains 35 sections, 1 equation, 9 figures, 2 tables.

Figures (9)

  • Figure 1: SLURGY SAMMY
  • Figure 2: Hierarchical structure of fallacies as constructed by Helwe et al. (2023).
  • Figure 3: Token frequency counts for the top 10 most frequent tokens in the data scraped from Reddit (top) and 4chan (bottom).
  • Figure 4: The upper bar plot shows the 10 most frequently occurring tokens and their counts for the data collected from Reddit (orange) and 4chan (green). The lower plot shows the vocabulary diversity within the data collected from each platform, with 4chan having a noticeably higher vocabulary diversity than Reddit despite having fewer samples.
  • Figure 5: Inter-annotator agreement scores between all four annotators, as calculated using the Jaccard Index for both label and span for each of the 150 samples.
  • ...and 4 more figures
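
The Figure 5 caption refers to inter-annotator agreement computed with the Jaccard Index over both labels and spans. As a rough illustration only, the sketch below computes the Jaccard index for one pair of annotators on one sample. The label names, the token-index representation of spans, and the convention that two empty annotations count as full agreement are all assumptions made for this example; the listing above does not specify how the authors represent spans or aggregate scores across the four annotators.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| between two collections.

    Assumption: two empty annotations are treated as full agreement (1.0).
    """
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical fallacy labels assigned to one comment by two annotators.
labels_a = {"ad hominem", "strawman"}
labels_b = {"ad hominem", "false dilemma"}
label_agreement = jaccard(labels_a, labels_b)

# Hypothetical spans, represented as sets of token indices (an assumption).
span_a = set(range(3, 9))    # tokens 3..8
span_b = set(range(5, 11))   # tokens 5..10
span_agreement = jaccard(span_a, span_b)

print(f"label agreement: {label_agreement:.3f}")
print(f"span agreement:  {span_agreement:.3f}")
```

Representing spans as sets of token indices lets the same function score both label overlap and span overlap; a character-offset representation would work the same way.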