RapidPen: Fully Automated IP-to-Shell Penetration Testing with LLM-based Agents
Sho Nakatani
TL;DR
RapidPen tackles the problem of fully automated IP-to-Shell penetration testing by marrying ReAct-style task planning with retrieval-augmented knowledge of exploits. The system uses a dual-layer data model (PTT) and two dedicated RAG repositories to plan and execute multi-step attacks with iterative feedback in the Act module, enabling autonomous exploitation starting from a single IP. In preliminary experiments on the Hack The Box (HTB) target Legacy, RapidPen achieved a 60% success rate when leveraging prior success cases, typically delivering a shell within 200–400 seconds at a cost of about $0.3–$0.6 per run, which suggests the approach is practical in both speed and cost. The work demonstrates the potential to democratize pentesting for non-experts while offloading repetitive tasks for professionals, and outlines concrete directions for expanding scope, improving robustness, and approaching real-world deployment.
Abstract
We present RapidPen, a fully automated penetration testing (pentesting) framework that addresses the challenge of achieving an initial foothold (IP-to-Shell) without human intervention. Unlike prior approaches that focus primarily on post-exploitation or require a human-in-the-loop, RapidPen leverages large language models (LLMs) to autonomously discover and exploit vulnerabilities, starting from a single IP address. By integrating advanced ReAct-style task planning (Re) with retrieval-augmented knowledge bases of successful exploits, along with a command-generation and direct execution feedback loop (Act), RapidPen systematically scans services, identifies viable attack vectors, and executes targeted exploits in a fully automated manner. In our evaluation against a vulnerable target from the Hack The Box platform, RapidPen achieved shell access within 200–400 seconds at a per-run cost of approximately $0.3–$0.6, demonstrating a 60% success rate when reusing prior "success-case" data. These results underscore the potential of truly autonomous pentesting for both security novices and seasoned professionals. Organizations without dedicated security teams can leverage RapidPen to quickly identify critical vulnerabilities, while expert pentesters can offload repetitive tasks and focus on complex challenges. Ultimately, our work aims to make penetration testing more accessible and cost-efficient, thereby enhancing the overall security posture of modern software ecosystems.
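To make the Re/Act split concrete, the following is a minimal control-flow sketch of a plan-then-execute loop, not RapidPen's actual implementation. All names here (`Task`, `ActResult`, `run_loop`, `scripted_planner`, `simulated_actor`) are hypothetical. The planner and executor are stubs that stand in for the LLM-driven modules, and nothing is run against a real host (the address is from the reserved documentation range).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Task:
    """One node of a hypothetical task tree tracked by the planner."""
    description: str
    status: str = "todo"  # "todo", "done", or "failed"
    children: List["Task"] = field(default_factory=list)


@dataclass
class ActResult:
    """Outcome of executing one task: log text plus a shell flag."""
    output: str
    success: bool
    shell_obtained: bool = False


History = List[Tuple[str, str]]
Planner = Callable[[Task, History], Optional[Task]]
Actor = Callable[[Task], ActResult]


def run_loop(target_ip: str, plan_next: Planner, act: Actor,
             max_steps: int = 10) -> Tuple[bool, Task, History]:
    """Alternate planning ("Re") and execution ("Act") until a shell
    is reported, the planner gives up, or the step budget runs out."""
    root = Task(f"obtain shell on {target_ip}")
    history: History = []
    for _ in range(max_steps):
        task = plan_next(root, history)   # Re: pick the next task
        if task is None:                  # planner has nothing left to try
            break
        root.children.append(task)
        result = act(task)                # Act: execute and observe
        task.status = "done" if result.success else "failed"
        history.append((task.description, result.output))  # feedback for Re
        if result.shell_obtained:
            root.status = "done"
            return True, root, history
    root.status = "failed"
    return False, root, history


# Stub modules (assumptions for illustration only).
SCRIPT = ["scan services", "identify attack vector", "execute exploit"]


def scripted_planner(root: Task, history: History) -> Optional[Task]:
    """Stand-in for the planner: replays a fixed list of steps."""
    i = len(history)
    return Task(SCRIPT[i]) if i < len(SCRIPT) else None


def simulated_actor(task: Task) -> ActResult:
    """Stand-in for command generation and execution: no real commands."""
    return ActResult(
        output=f"simulated output for: {task.description}",
        success=True,
        shell_obtained=(task.description == SCRIPT[-1]),
    )


ok, tree, log = run_loop("192.0.2.10", scripted_planner, simulated_actor)
```

In this sketch the `history` list is the only feedback channel from Act back to Re. A retrieval step over prior success cases would plausibly sit inside the planner, but its design is not specified in the abstract and is left out here.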
