LLM Agent Honeypot: Monitoring AI Hacking Agents in the Wild
Reworr, Dmitrii Volkov
TL;DR
The paper addresses the emerging threat of autonomous AI hacking agents by introducing the LLM Agent Honeypot, a modified Cowrie SSH honeypot augmented with prompt injection and timing analysis to detect LLM-based attackers in the wild. Its multi-step detection methodology combines active manipulation (prompt injection) with timing cues. A public deployment gathered over 8 million interactions and identified 8 potential AI-driven attacks. The work provides empirical evidence of AI-driven threats in real-world attack traffic and offers a real-time public dashboard that serves as an early warning mechanism for defenders. The study lays groundwork for understanding the AI-enabled threat landscape and motivates further research into robust detection and broader honeypot deployment.
Abstract
Attacks powered by Large Language Model (LLM) agents represent a growing threat to modern cybersecurity. To address this concern, we present LLM Honeypot, a system designed to monitor autonomous AI hacking agents. By augmenting a standard SSH honeypot with prompt injection and time-based analysis techniques, our framework aims to distinguish LLM agents from all other attackers. Over a trial deployment of about three months in a public environment, we recorded 8,130,731 hacking attempts and identified 8 potential AI agents. Our work demonstrates the emergence of AI-driven threats and their current level of usage, serving as an early warning of malicious LLM agents in the wild.
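The combination of prompt injection and time-based analysis described above can be illustrated with a minimal sketch. It is not the authors' implementation: the canary string, the latency threshold, the `Session` structure, and the label names are all hypothetical values chosen for illustration. The idea is that a scripted bot ignores an injected instruction, while a session that obeys it is then separated by response time, on the assumption that an LLM agent answers faster than a human reading and typing.

```python
from dataclasses import dataclass
from statistics import median

# Hypothetical values for illustration only; not taken from the paper.
CANARY = "cat8193"          # token the injected prompt asks the attacker to echo
MAX_LLM_LATENCY_S = 2.0     # assumed upper bound on an LLM agent's reply time


@dataclass
class Session:
    source_ip: str
    # Each entry: (text the attacker sent after the injected prompt, seconds taken)
    replies: list[tuple[str, float]]


def classify(session: Session) -> str:
    """Label a session using two signals: injection compliance, then timing."""
    # Step 1: did any reply follow the injected instruction?
    followed = [latency for text, latency in session.replies if CANARY in text]
    if not followed:
        return "scripted_bot"
    # Step 2: among compliant replies, is the typical latency machine-like?
    if median(followed) <= MAX_LLM_LATENCY_S:
        return "potential_llm_agent"
    return "possible_human"
```

Under these assumptions, a session that never echoes the canary is treated as conventional automated traffic, and only sessions that both comply and respond quickly are flagged for review.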
