Incalmo: An Autonomous LLM-assisted System for Red Teaming Multi-Host Networks
Brian Singer, Keane Lucas, Lakshmi Adiga, Meghna Jain, Lujo Bauer, Vyas Sekar
TL;DR
This work addresses the challenge of autonomously executing red-team attacks across multi-host networks, where prior LLM-based systems struggle because they operate on low-level commands and suffer from context bloat. Incalmo proposes a two-layer architecture that decouples planning from execution: the LLM plans in terms of high-level declarative tasks, domain-specific agents execute those tasks, and auxiliary services (environment state, attack graph, and C&C) manage knowledge and assets. On MHBench, a benchmark of 40 multi-host environments, Incalmo succeeds in 37/40 environments, with successful attacks taking 12-54 minutes and costing under $15 in LLM credits, whereas baseline systems succeed in only 3/40. The results highlight the importance of abstraction, modular agents, and robust context management for scalable, autonomous red teaming, and the work provides open-source tools to support reproducibility and further research.
Abstract
Security operators use red teams to simulate real attackers and proactively find defense gaps. In realistic enterprise settings, this involves executing multi-host network attacks spanning many "stepping stone" hosts. Unfortunately, red teaming is expensive and requires significant expertise and effort. Given the promise of LLMs in CTF challenges, we first analyze whether LLMs can autonomously execute multi-host red team exercises. We find that state-of-the-art LLM-assisted offense systems (e.g., PentestGPT, CyberSecEval3) with leading LLMs (e.g., Sonnet 4, Gemini 2.5 Pro) are unable to do so. Based on our analysis of the failure modes of these systems, we argue that LLM-assisted red teaming needs better abstractions and interfaces. Based on this insight, we present the design and implementation of Incalmo, an LLM-assisted system for autonomously red teaming multi-host networks. Incalmo uses LLMs to plan red team exercises in terms of high-level declarative tasks that are executed by domain-specific task agents. Incalmo also uses auxiliary services to manage context and acquired assets. For our evaluation, we develop MHBench, a novel multi-host attack benchmark with 40 realistic emulated networks (from 22 to 50 hosts). We find that Incalmo successfully acquires critical assets (i.e., key hosts or data) in 37 out of 40 MHBench environments. In contrast, state-of-the-art LLM-assisted systems succeed in only 3 out of 40 environments. We show that Incalmo is efficient: successful attacks took 12-54 minutes and cost less than $15 in LLM credits.
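To make the separation between planning and execution concrete, the following is a minimal, simulated sketch of dispatching declarative tasks to task agents that share a state service. It is not Incalmo's implementation: the names `Task`, `EnvironmentState`, `scan_agent`, `lateral_move_agent`, and `execute_plan` are hypothetical, the plan is hard-coded where an LLM would normally produce it, and the agents only update in-memory state rather than touching any network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class EnvironmentState:
    """Hypothetical stand-in for an auxiliary service tracking acquired knowledge."""
    known_hosts: Set[str] = field(default_factory=set)
    compromised: Set[str] = field(default_factory=set)


@dataclass(frozen=True)
class Task:
    """A declarative task: states what to achieve, not which commands to run."""
    name: str
    target: str


def scan_agent(task: Task, state: EnvironmentState) -> str:
    # Simulation only: record the target as discovered.
    state.known_hosts.add(task.target)
    return f"scanned {task.target}"


def lateral_move_agent(task: Task, state: EnvironmentState) -> str:
    # Simulation only: refuse to act on hosts the state service has not seen.
    if task.target not in state.known_hosts:
        return f"skipped {task.target}: host not discovered"
    state.compromised.add(task.target)
    return f"moved to {task.target}"


# Registry mapping task names to the agent responsible for them (assumed design).
AGENTS: Dict[str, Callable[[Task, EnvironmentState], str]] = {
    "scan": scan_agent,
    "lateral_move": lateral_move_agent,
}


def execute_plan(plan: List[Task], state: EnvironmentState) -> List[str]:
    """Run each task through its agent, sharing one state object across tasks."""
    return [AGENTS[task.name](task, state) for task in plan]


# In the described system an LLM would emit the plan; here it is hard-coded.
plan = [
    Task("scan", "10.0.0.5"),
    Task("lateral_move", "10.0.0.5"),
    Task("lateral_move", "10.0.0.9"),
]
state = EnvironmentState()
log = execute_plan(plan, state)
```

In this sketch the planner only needs the short task vocabulary and the per-task results, while discovered hosts and acquired assets live in the shared state object rather than in the planner's context.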
