Anatomy of an AI-powered malicious social botnet
Kai-Cheng Yang, Filippo Menczer
TL;DR
This paper presents a real-world case study of a Twitter botnet that appears to use ChatGPT to generate content. The authors identify a dense cluster of 1,140 accounts (the fox8 botnet) linked to three suspicious domains, surfaced through self-revealing tweets. The study documents coordinated following, inter-bot replies and retweets, and ChatGPT-generated content promoting dubious sites, and shows that current LLM-content detectors struggle to separate these bots from humans in the wild. The authors evaluate detectors such as Botometer, OpenAI's AI Text Classifier, and GPTZero, finding Botometer ineffective against LLM-powered bots and GPTZero unreliable in field conditions, while the OpenAI detector shows promise at the tweet level, with caveats. They propose an account-level detection approach based on per-tweet OpenAI scores that achieves an F1 of 0.84 on the fox8-23 dataset, but emphasize the need for larger, multilingual, field-captured data to generalize beyond this botnet. Overall, the work highlights the practical threats posed by LLM-powered social bots and outlines methodological and regulatory paths toward better detection and resilience as AI-enabled manipulation spreads online.
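To make the account-level idea concrete, the following is a minimal sketch of how per-tweet detector scores might be aggregated into an account-level decision and evaluated with F1. The aggregation rule (a simple mean), the 0.5 threshold, the function names, and the toy scores and labels are all illustrative assumptions; they are not the authors' procedure or data, and no real detector is called.

```python
from statistics import mean

def classify_accounts(scores_by_account, threshold=0.5):
    """Flag an account as a likely bot when the mean of its per-tweet
    AI-likelihood scores reaches the threshold (assumed rule)."""
    return {
        account: mean(scores) >= threshold
        for account, scores in scores_by_account.items()
        if scores  # skip accounts with no scored tweets
    }

def f1_score(predicted, actual):
    """F1 for the positive (bot) class over the accounts in `actual`."""
    tp = sum(1 for a, is_bot in actual.items() if is_bot and predicted.get(a, False))
    fp = sum(1 for a, is_bot in actual.items() if not is_bot and predicted.get(a, False))
    fn = sum(1 for a, is_bot in actual.items() if is_bot and not predicted.get(a, False))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up per-tweet scores in [0, 1]; higher means "more likely AI-generated".
toy_scores = {
    "acct_a": [0.9, 0.8],
    "acct_b": [0.2, 0.1],
    "acct_c": [0.6, 0.3],
    "acct_d": [0.7, 0.9],
}
# Made-up ground truth: True marks a bot account.
toy_labels = {"acct_a": True, "acct_b": False, "acct_c": True, "acct_d": False}

toy_predictions = classify_accounts(toy_scores)
toy_f1 = f1_score(toy_predictions, toy_labels)
```

Averaging is only one possible choice; a median or the fraction of tweets above a cutoff would fit the same structure, and the threshold would normally be tuned on labeled data.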
Abstract
Large language models (LLMs) exhibit impressive capabilities in generating realistic text across diverse subjects. Concerns have been raised that they could be utilized to produce fake content with a deceptive intention, although evidence thus far remains anecdotal. This paper presents a case study about a Twitter botnet that appears to employ ChatGPT to generate human-like content. Through heuristics, we identify 1,140 accounts and validate them via manual annotation. These accounts form a dense cluster of fake personas that exhibit similar behaviors, including posting machine-generated content and stolen images, and that engage with each other through replies and retweets. ChatGPT-generated content promotes suspicious websites and spreads harmful comments. While the accounts in the AI botnet can be detected through their coordination patterns, current state-of-the-art LLM content classifiers fail to discriminate between them and human accounts in the wild. These findings highlight the threats posed by AI-enabled social bots.
