You are a Bot! -- Studying the Development of Bot Accusations on Twitter

Dennis Assenmacher, Leon Fröhling, Claudia Wagner

TL;DR

The paper analyzes how Twitter users accuse others of being bots, revealing a shift from automation-related labeling to dehumanizing insults after 2017, especially in polarized political contexts. It introduces the You're a bot! datasets and employs a multi-method approach (BERTweet-based accusation detection, Word2Vec temporal embeddings, clustering, toxicity assessment, Botometer scores, and ideology estimation) to understand evolution, context, and alignment with academic bot definitions. The findings show bot accusations are a poor ground truth for bot detection due to concept drift and social dehumanization, with significant implications for researchers and platform discourse. The work highlights the dynamic nature of bot labeling on social media and provides a foundation for future research on the social effects of accusations and dehumanization in online conversations.

Abstract

The characterization and detection of bots with their presumed ability to manipulate society on social media platforms have been subject to many research endeavors over the last decade. In the absence of ground truth data (i.e., accounts that are labeled as bots by experts or self-declare their automated nature), researchers interested in the characterization and detection of bots may want to tap into the wisdom of the crowd. But how many people need to accuse another user as a bot before we can assume that the account is most likely automated? And more importantly, are bot accusations on social media at all a valid signal for the detection of bots? Our research presents the first large-scale study of bot accusations on Twitter and shows how the term bot became an instrument of dehumanization in social media conversations since it is predominantly used to deny the humanness of conversation partners. Consequently, bot accusations on social media should not be naively used as a signal to train or test bot detection models.

Paper Structure (21 sections, 7 figures, 2 tables)

This paper contains 21 sections, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Dataset structure with different accusation situations over time. Because of the data collection strategy, the objects $user_A$ and $tweet_A$ are always present in any accusation situation, whereas the objects $user_O$ and $tweet_O$ could be missing since the tweet and/or the user account have been deleted and are consequently no longer available through the Twitter API.
  • Figure 2: Number of instances in the different data subsets. While the sizes of the different datasets differ, their developments follow the same trends, like the steep increase after 2017.
  • Figure 3: Development of Detoxify toxicity scores for $tweet_O$ and $tweet_A$ objects. Comparison with the baseline of generic Reply-Tweets indicates that the observed increase in toxicity for the accusations ($tweet_A$) after 2016 cannot be explained through an overall increase of toxicity on Twitter over time.
  • Figure 4: Top 100 clusters for the Tweets sent by $user_O$ (the user accused of being a bot) for 2021 projected onto a 2-dimensional space with UMAP. Clusters are annotated with their highest cTFIDF terms. Users are no longer accused only in the context of automated behavior ("good morning spam") but specifically in the context of polarizing debates around topics like covid/mask/vaccine, election/Biden/Trump or Scottish independence. Additionally, we observe a large cluster indicating bot accusation loops (e.g. $user_O$: you are a bot - blocked! $user_A$: i'm not a bot, but you are!).
  • Figure 5: Ideology scores for $user_O$ and $user_A$ according to Barberá (2015). Negative scores are associated with Left-leaning political positions, while positive scores are associated with Right-leaning political positions. One can see that most accusations occur in polarizing topics and are directed from left-leaning towards right-leaning users (lower right quadrant). Users on the political Right tend to not accuse each other, while users on the political Left accuse each other more frequently.
  • ...and 2 more figures
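
The dataset structure described in the Figure 1 caption can be illustrated with a small record type. This is only a minimal sketch, not the authors' schema: the class names, field names, and the helper `split_by_completeness` are hypothetical. It assumes nothing beyond what the caption states, namely that $user_A$ and $tweet_A$ are always present while $user_O$ and $tweet_O$ may be missing.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Tweet:
    # Hypothetical minimal tweet record; field names are assumptions.
    tweet_id: str
    author_id: str
    text: str


@dataclass(frozen=True)
class AccusationSituation:
    # Always present: data collection starts from the accusing tweet.
    user_a: str
    tweet_a: Tweet
    # May be None if the accused account or its tweet has been deleted
    # and is no longer available through the API.
    user_o: Optional[str] = None
    tweet_o: Optional[Tweet] = None

    def is_complete(self) -> bool:
        """True when both the accused user and the accused tweet are available."""
        return self.user_o is not None and self.tweet_o is not None


def split_by_completeness(
    situations: List[AccusationSituation],
) -> Tuple[List[AccusationSituation], List[AccusationSituation]]:
    """Separate situations with all four objects from those with missing ones."""
    complete = [s for s in situations if s.is_complete()]
    partial = [s for s in situations if not s.is_complete()]
    return complete, partial
```

Modeling the possibly missing objects as `Optional` fields makes it explicit which analyses can run on every accusation (those using only $tweet_A$) and which need the complete subset.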