Human, AI, and Hybrid Ensembles for Detection of Adaptive, RL-based Social Bots

Valerio La Gatta, Nathan Subrahmanian, Kaitlyn Wang, Larry Birnbaum, V. S. Subrahmanian

Abstract

The use of reinforcement learning to dynamically adapt and evade detection is now well-documented in several cybersecurity settings including Covert Social Influence Operations (CSIOs), in which bots try to spread disinformation. While AI bot detectors have improved greatly, they are largely limited to detecting static bots that do not adapt dynamically. We present the first systematic study comparing the ability of humans, AI models, and hybrid Human-AI ensembles in detecting adaptive bots powered by reinforcement learning. Using data from a controlled, IRB-approved, five-day experiment with participants interacting on a social media platform infiltrated by RL-trained bots spreading disinformation to influence participants on 4 topics, we examine factors potentially shaping human detection capabilities: demographic characteristics, temporal learning effects, social network position, engagement patterns, and collective intelligence mechanisms. We first test 13 hypotheses comparing human bot detection performance against state-of-the-art AI approaches utilizing both traditional machine learning and large language models. We further investigate several aggregation strategies that combine human reports of bots with AI predictions, as well as retraining protocols that leverage human supervision. Our findings challenge intuitive assumptions about bot detection, reveal unexpected patterns in how humans identify bots, and show that combining human bot reports with AI predictions outperforms humans alone and AI alone. We conclude with a discussion of the practical implications of these results for industry.

Paper Structure (20 sections, 3 equations, 8 figures, 8 tables)

Figures (8)

  • Figure 1: Human bot detection performance across the five-day experiment. (a) Day-specific: accounts classified as bots if reported on that specific day. (b) Cumulative: accounts classified as bots if reported on any day up to and including the current day.
  • Figure 2: Relationship between bot interaction patterns and detection performance. (a-b) Bot Engagement Ratio (BER): proportion of a user's outgoing interactions (follows and likes) directed toward bots. (c-d) Bot Exposure Ratio (BXR): proportion of a user's incoming interactions (follows and likes) received from bots.
  • Figure 3: Conditional probability that an account is a bot given that it was reported $k$ times, $P(\text{bot} \mid k)$. The red dashed line shows the overall proportion of bots in the original experiment (0.262).
  • Figure 4: Human vs AI Comparison: Pairwise Agreement Rate (a) and Cohen's $\kappa$ (b) for humans and the best performing detectors, i.e., BotBuster (trained on Twibot-20), RFS (trained on Caverlee-2011), LLM-based (llama:70b), and LLM-based (mistral-24b).
  • Figure 5: Relative F1-score improvement over baseline (no retraining) across experimental days for four detectors pretrained on Twibot-20. Each subplot shows one detector. Ground-Truth Supervision retrains on correct predictions, Self-Supervision retrains on high-confidence predictions (threshold 0.7), and Human Supervision retrains on human reports.
  • ...and 3 more figures
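
The quantities named in the captions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the input shapes (sets of account ids, lists of interaction partners, binary label lists), and the choice to return 0.0 for a user with no interactions are all hypothetical.

```python
from collections import Counter


def day_specific_and_cumulative(reports_by_day):
    """Figure 1 labelling: accounts flagged on each day vs. on any day so far.

    reports_by_day: list of sets of reported account ids, one set per day
    (assumed input shape).
    """
    day_specific = [set(day) for day in reports_by_day]
    cumulative, seen = [], set()
    for day in day_specific:
        seen = seen | day          # new set each day, so earlier entries stay fixed
        cumulative.append(seen)
    return day_specific, cumulative


def bot_ratio(partners, bots):
    """Share of interactions whose counterpart is a bot.

    Pass outgoing interaction targets for BER, incoming sources for BXR.
    Returning 0.0 for an empty list is an assumption of this sketch.
    """
    if not partners:
        return 0.0
    return sum(p in bots for p in partners) / len(partners)


def p_bot_given_k(report_counts, bots):
    """Empirical P(bot | k): fraction of bots among accounts reported k times."""
    totals, hits = Counter(), Counter()
    for account, k in report_counts.items():
        totals[k] += 1
        hits[k] += account in bots
    return {k: hits[k] / totals[k] for k in totals}


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two binary (0/1) label lists of equal length."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    pa, pb = sum(labels_a) / n, sum(labels_b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)   # chance agreement
    if expected == 1.0:                         # both raters constant and identical
        return 1.0
    return (observed - expected) / (1 - expected)
```

The pairwise agreement rate in Figure 4(a) corresponds to the `observed` term inside `cohen_kappa`; kappa corrects it for the agreement expected by chance.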