LLM Agent Honeypot: Monitoring AI Hacking Agents in the Wild

Reworr, Dmitrii Volkov

TL;DR

The paper addresses the emerging threat of autonomous AI hacking agents by introducing the LLM Agent Honeypot, a modified Cowrie SSH honeypot augmented with prompt injection and timing analysis to detect LLM-based attackers in the wild. It presents a multi-step detection methodology that combines active manipulation (prompt injection) with timing cues, and reports on a public deployment that recorded over 8 million interactions, 8 of which were flagged as potential AI-driven attacks. The work provides empirical evidence of AI-driven threats in real-world attack traffic and delivers a real-time public dashboard for monitoring, giving defenders an early warning mechanism. Overall, the study offers a starting point for understanding the AI-enabled threat landscape and motivates further research into robust detection and broader honeypot deployment.

Abstract

Attacks powered by Large Language Model (LLM) agents represent a growing threat to modern cybersecurity. To address this concern, we present LLM Honeypot, a system designed to monitor autonomous AI hacking agents. By augmenting a standard SSH honeypot with prompt injection and time-based analysis techniques, our framework aims to distinguish LLM agents from all other attackers. Over a trial deployment of about three months in a public environment, we recorded 8,130,731 hacking attempts and identified 8 potential AI agents. Our work demonstrates the emergence of AI-driven threats and their current level of usage, serving as an early warning of malicious LLM agents in the wild.
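
To make the combination of prompt injection and time-based analysis concrete, here is a minimal Python sketch of how the two signals might be combined into one decision. It is an illustration only, not the authors' implementation. The canary string, the 1.5-second threshold, the use of a median, and the names `Session` and `classify` are all assumptions introduced for this example. The sketch further assumes that a scripted bot ignores injected instructions, and that a human who follows them replies more slowly than an LLM agent would.

```python
from dataclasses import dataclass
from statistics import median

# Hypothetical values chosen for illustration; not taken from the paper.
CANARY = "cat8193"        # token an injected prompt asks the client to echo back
HUMAN_MIN_DELAY_S = 1.5   # assumed lower bound on how fast a human reads and replies


@dataclass
class Session:
    """One honeypot session: client replies and the delay before each reply."""
    replies: list[str]
    reply_delays_s: list[float]


def followed_injection(session: Session) -> bool:
    # Active signal: did any reply contain the token requested by the injection?
    return any(CANARY in reply for reply in session.replies)


def replied_fast(session: Session) -> bool:
    # Timing signal: is the median reply delay below the assumed human threshold?
    if not session.reply_delays_s:
        return False
    return median(session.reply_delays_s) < HUMAN_MIN_DELAY_S


def classify(session: Session) -> str:
    # Both signals are required before a session is flagged as a potential agent.
    if followed_injection(session):
        if replied_fast(session):
            return "potential LLM agent"
        return "possible human"
    return "scripted bot or other"


if __name__ == "__main__":
    demo = Session(replies=["uid=0(root)", "cat8193"], reply_delays_s=[0.4, 0.6, 0.5])
    print(classify(demo))
```

Requiring both signals keeps the two failure modes separate: the injection check filters out conventional scripted bots, while the timing check filters out humans who happen to follow the injected instruction.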

Paper Structure

This paper contains 23 sections and 13 figures.

Figures (13)

  • Figure 1: Success rate by prompt injection type
  • Figure 2: Success rate by prompt injection goal
  • Figure 3: Internal Evaluations of GPT-4o LLM Agents
  • Figure 4: Timing Analysis of all bots in the wild
  • Figure 5: Honeypot Detection Scheme
  • ...and 8 more figures