What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection

Shangbin Feng, Herun Wan, Ningnan Wang, Zhaoxuan Tan, Minnan Luo, Yulia Tsvetkov

TL;DR

This work examines the dual-use potential of large language models (LLMs) in social media bot detection. It introduces a mixture-of-heterogeneous-experts framework in which modality-specific LLMs analyze metadata, text, and network information; the experts are adapted through in-context learning or instruction tuning, and their predictions are combined by ensemble voting, yielding state-of-the-art detection on TwiBot-20 and TwiBot-22. The study also analyzes risks, detailing LLM-guided textual and structural manipulations that can significantly degrade detector performance and calibration. Empirically, instruction-tuned LLMs improve F1 over baselines by up to 9.1%, while manipulations can reduce performance by as much as 29.6% and worsen calibration, underscoring the need for robust defense mechanisms and policy considerations in deploying LLM-based detectors.

Abstract

Social media bot detection has always been an arms race between advancements in machine learning bot detectors and adversarial bot strategies to evade detection. In this work, we bring the arms race to the next level by investigating the opportunities and risks of state-of-the-art large language models (LLMs) in social bot detection. To investigate the opportunities, we design novel LLM-based bot detectors by proposing a mixture-of-heterogeneous-experts framework to divide and conquer diverse user information modalities. To illuminate the risks, we explore the possibility of LLM-guided manipulation of user textual and structured information to evade detection. Extensive experiments with three LLMs on two datasets demonstrate that instruction tuning on merely 1,000 annotated examples produces specialized LLMs that outperform state-of-the-art baselines by up to 9.1% on both datasets, while LLM-guided manipulation strategies could significantly bring down the performance of existing bot detectors by up to 29.6% and harm the calibration and reliability of bot detection systems.
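
The divide-and-conquer idea can be illustrated with a minimal sketch of ensemble voting over modality-specific experts. Everything here is hypothetical: the three expert functions are toy stand-ins for LLM calls, the field names (`followers`, `text`, `bot_neighbor_ratio`) are invented for illustration, and the tie-break toward "human" is an assumption rather than the paper's rule.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical expert type: maps a user record to the label "bot" or "human".
Expert = Callable[[Dict], str]


def majority_vote(experts: List[Expert], user: Dict) -> str:
    """Combine expert labels by simple majority voting."""
    votes = Counter(expert(user) for expert in experts)
    # Assumption: ties resolve to "human"; the paper may handle ties differently.
    return "bot" if votes["bot"] > votes["human"] else "human"


# Toy stand-ins for the metadata, text, and network experts.
# In an LLM-based detector each of these would prompt a model instead.
def metadata_expert(user: Dict) -> str:
    return "bot" if user["followers"] < 10 else "human"


def text_expert(user: Dict) -> str:
    return "bot" if "free crypto" in user["text"].lower() else "human"


def network_expert(user: Dict) -> str:
    return "bot" if user["bot_neighbor_ratio"] > 0.5 else "human"


EXPERTS: List[Expert] = [metadata_expert, text_expert, network_expert]

suspicious = {"followers": 3, "text": "Claim FREE crypto now", "bot_neighbor_ratio": 0.2}
ordinary = {"followers": 250, "text": "Lovely weather today", "bot_neighbor_ratio": 0.1}

print(majority_vote(EXPERTS, suspicious))
print(majority_vote(EXPERTS, ordinary))
```

Keeping each modality behind its own function means a single expert can be swapped out (for example, replaced by an instruction-tuned model) without touching the voting logic.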

Paper Structure

This paper contains 50 sections, 6 figures, 20 tables.

Figures (6)

  • Figure 1: Overview of the opportunities of LLM-based bot detectors and risks of LLM-based evasive bots.
  • Figure 2: Calibration of LLM-based bot detectors on the original TwiBot-20 dataset and on the version manipulated with both combine. ECE denotes estimated calibration error; lower is better. The dashed line indicates perfect calibration, and a bar is darker the closer it is to perfect calibration.
  • Figure 3: GPT-4 evaluation of whether the LLM-paraphrased bot post is similar in content to the original post, on a scale from 1 ("very different") to 4 ("very similar"). We present the average value and standard deviation.
  • Figure 4: The trend of bot likelihood scores given by the external classifier in the classifier guidance strategy of paraphrasing bot posts.
  • Figure 5: Distributions of accounts' metadata that are selected by LLMs to be added/removed from a bot account's following list.
  • ...and 1 more figures
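
For readers unfamiliar with the ECE metric in the Figure 2 caption, here is a minimal sketch of the widely used binned formulation (usually called expected calibration error). Whether the paper uses exactly this binning scheme or bin count is an assumption; the function name and the default `n_bins=10` are illustrative choices.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Binned ECE: sample-weighted mean of |accuracy - confidence| per bin."""
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    # Assign each prediction to an equal-width confidence bin over [0, 1].
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # a confidence of 1.0 goes in the last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(conf for conf, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / n) * abs(accuracy - avg_conf)
    return ece


# Toy inputs, not data from the paper.
ece = expected_calibration_error([0.95, 0.95, 0.65, 0.65], [True, True, True, False])
print(round(ece, 3))
```

A perfectly calibrated detector would have accuracy equal to average confidence in every bin, giving an ECE of zero, which corresponds to the dashed diagonal in a reliability plot.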