Anatomy of an AI-powered malicious social botnet

Kai-Cheng Yang, Filippo Menczer

TL;DR

This paper presents a real-world case study of a Twitter botnet that appears to be powered by ChatGPT. The authors identify a dense cluster of 1,140 accounts (the fox8 botnet) linked to three suspicious domains and surfaced through self-revealing tweets. They document coordinated following, inter-bot replies and retweets, and the use of ChatGPT to generate content promoting dubious sites, and they show that current LLM-content detectors struggle to separate these bots from humans in the wild. The detectors evaluated are Botometer, OpenAI's AI Text Classifier, and GPTZero: Botometer is ineffective against LLM-powered bots and GPTZero is unreliable in field conditions, while the OpenAI detector shows promise at the tweet level, with caveats. The authors propose an account-level detection approach based on per-tweet OpenAI scores that achieves an F1 of 0.84 on the fox8-23 dataset, but they emphasize that larger, multilingual, field-captured data are needed to generalize beyond this botnet. Overall, the work highlights the practical threats posed by LLM-powered social bots and outlines methodological and regulatory paths to improve detection and resilience as AI-enabled manipulation spreads online.
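
As a rough illustration of what account-level detection from per-tweet scores could look like, the sketch below averages per-tweet detector scores for each account, applies a threshold, and computes F1. The mean aggregation, the 0.5 threshold, the function names, and the toy scores are all assumptions made for this example; they are not taken from the paper and do not reproduce its reported numbers.

```python
from statistics import mean


def account_scores(tweet_scores_by_account):
    """Collapse per-tweet detector scores into one score per account.

    Assumption: a simple mean is used; the paper may aggregate differently.
    """
    return {
        account: mean(scores)
        for account, scores in tweet_scores_by_account.items()
        if scores  # skip accounts with no scored tweets
    }


def classify(scores, threshold=0.5):
    """Flag an account as a bot when its aggregated score reaches the threshold."""
    return {account: score >= threshold for account, score in scores.items()}


def f1_score(predicted, actual):
    """Compute F1 for the positive (bot) class from two {account: bool} dicts."""
    tp = sum(1 for a, is_bot in actual.items() if is_bot and predicted.get(a, False))
    fp = sum(1 for a, is_bot in actual.items() if not is_bot and predicted.get(a, False))
    fn = sum(1 for a, is_bot in actual.items() if is_bot and not predicted.get(a, False))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up per-tweet scores (higher = more likely machine-generated).
toy_tweet_scores = {
    "bot_a": [0.9, 0.8, 0.7],
    "bot_b": [0.4, 0.3],
    "human_a": [0.1, 0.2],
    "human_b": [0.6, 0.7],
}
toy_labels = {"bot_a": True, "bot_b": True, "human_a": False, "human_b": False}

toy_predictions = classify(account_scores(toy_tweet_scores))
toy_f1 = f1_score(toy_predictions, toy_labels)
print(f"Toy account-level F1: {toy_f1:.2f}")
```

In this toy data one bot is missed and one human is falsely flagged, which shows how noisy tweet-level scores still produce both error types after aggregation.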

Abstract

Large language models (LLMs) exhibit impressive capabilities in generating realistic text across diverse subjects. Concerns have been raised that they could be utilized to produce fake content with a deceptive intention, although evidence thus far remains anecdotal. This paper presents a case study about a Twitter botnet that appears to employ ChatGPT to generate human-like content. Through heuristics, we identify 1,140 accounts and validate them via manual annotation. These accounts form a dense cluster of fake personas that exhibit similar behaviors, including posting machine-generated content and stolen images, and engage with each other through replies and retweets. ChatGPT-generated content promotes suspicious websites and spreads harmful comments. While the accounts in the AI botnet can be detected through their coordination patterns, current state-of-the-art LLM content classifiers fail to discriminate between them and human accounts in the wild. These findings highlight the threats posed by AI-enabled social bots.

Paper Structure (19 sections, 8 figures, 1 table)

Figures (8)

  • Figure 1: Profile characteristics of the fox8 bots (N=1,140). We show the distributions of (a) follower count, (b) following (friend) count, (c) tweet count, and (d) year of creation.
  • Figure 2: Social networks of the fox8 bots. (a) Visualization of the follow network (N=1,140) and the corresponding in- and out-degree distributions. (b) Same as (a) but for the reply network (N=1,036). (c) Same as (a) but for the retweet network (N=1,058). (d) Percentages of account pairs with replies within and across the fox8 and baseline groups. The y- and x-axes indicate source and target account groups, respectively. (e) Same as (d) but for retweets.
  • Figure 3: Distributions of tweet types for bot and human accounts in the fox8-23 dataset. (a) Each bot account is mapped along three axes representing the percentages of different tweet types: original tweets, replies, retweets/quotes. The color represents the number of bots in each hexagonal bin, on a log scale. (b) Same as (a), but for the human accounts.
  • Figure 4: Hashtags and accounts amplified by the fox8 bots. (a) Ten most frequent hashtags shared by fox8 bots in their recent tweets. (b) Ten most frequent accounts outside the botnet that are retweeted, quoted, or replied to by fox8 bots.
  • Figure 5: Websites shared by the fox8 bots. (a) Ten most frequent websites shared by fox8 bots in their recent tweets. (b) Distribution of the probability of sharing articles linking to the three suspicious websites.
  • ...and 3 more figures