Construction and Evaluation of LLM-based Agents for Semi-Autonomous Penetration Testing
Masaya Kobayashi, Masane Fuchi, Amar Zanashir, Tomonori Yoneda, Tomohiro Takagi
TL;DR
The paper tackles the challenge of achieving semi-autonomy in penetration testing using LLM-based agents. It introduces a three-module architecture (planning, execution, summarization) that uses retrieval-augmented generation (RAG) and MITRE ATT&CK-based planning (PTT) to coordinate attack workflows. The authors demonstrate that the system can autonomously generate attack strategies and commands on Hack The Box targets, although the validation is qualitative and sensitive steps remain under human oversight. Key contributions include integrating RAG with multi-LLM coordination and a data flow that updates execution plans with actual command results. The authors highlight the system's practical potential while acknowledging its limitations. Future work points to knowledge graphs, phase-specific agents, and deeper tool integration to improve the robustness and scalability of autonomous pentesting.
Abstract
With the emergence of high-performance large language models (LLMs) such as GPT, Claude, and Gemini, the autonomous and semi-autonomous execution of tasks has significantly advanced across various domains. However, in highly specialized fields such as cybersecurity, full autonomy remains a challenge. This difficulty primarily stems from the limitations of LLMs in reasoning capabilities and domain-specific knowledge. We propose a system that semi-autonomously executes complex cybersecurity workflows by employing multiple LLM modules to formulate attack strategies, generate commands, and analyze results, thereby addressing the aforementioned challenges. In our experiments using Hack The Box virtual machines, we confirmed that our system can autonomously construct attack strategies, issue appropriate commands, and automate certain processes, thereby reducing the need for manual intervention.
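The plan, execute, summarize loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (Task, Plan, run_loop, and the planner, propose_command, run_command, summarize, and approve callables) is hypothetical, the LLM modules are represented by plain functions supplied by the caller, and retrieval is omitted. The sketch only shows the data flow in which summarized results feed back into the plan, with a human approval gate before any command runs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Task:
    description: str
    status: str = "todo"  # "todo", "done", or "skipped"
    finding: str = ""


@dataclass
class Plan:
    tasks: List[Task] = field(default_factory=list)

    def next_task(self) -> Optional[Task]:
        # Return the first task that has not been handled yet.
        return next((t for t in self.tasks if t.status == "todo"), None)


def run_loop(
    plan: Plan,
    planner: Callable[[Plan, str], List[str]],  # hypothetical planning module
    propose_command: Callable[[str], str],      # hypothetical execution module
    run_command: Callable[[str], str],          # runs an approved command
    summarize: Callable[[str], str],            # hypothetical summarization module
    approve: Callable[[str], bool],             # human oversight gate
    max_steps: int = 10,
) -> List[Tuple[str, str, str]]:
    """Run tasks one at a time and feed each summarized result back into the plan."""
    log: List[Tuple[str, str, str]] = []
    for _ in range(max_steps):
        task = plan.next_task()
        if task is None:
            break
        command = propose_command(task.description)
        if not approve(command):
            # The operator rejected the command, so nothing is executed.
            task.status = "skipped"
            log.append((task.description, command, "skipped"))
            continue
        task.finding = summarize(run_command(command))
        task.status = "done"
        log.append((task.description, command, task.finding))
        # The planner sees the new finding and may append follow-up tasks.
        for description in planner(plan, task.finding):
            plan.tasks.append(Task(description))
    return log
```

Passing the modules in as callables keeps the loop independent of any particular model or tool, and the approve callback is the single point where an operator can stop a sensitive step.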
