AI Agent Smart Contract Exploit Generation
Arthur Gervais, Liyi Zhou
TL;DR
The paper addresses the challenge of scalable smart-contract security auditing by introducing A1, an end-to-end agentic system that turns general-purpose LLMs into autonomous exploit generators for DeFi contracts. Leveraging six domain-specific tools and a test-time execution loop on forked blockchain states, A1 discovers, tests, and monetizes real-world vulnerabilities with concrete PoCs and execution validation. Empirical evaluation across 36 incidents and six LLMs demonstrates competitive performance on VERITE (63% success) and substantial potential revenue, while revealing important dynamics around detection latency, economic incentives, and model behavior (including memorization versus reasoning). The work highlights the practical potential and risks of AI-driven vulnerability discovery and motivates future research on defense-oriented AI tooling, rapid detection pipelines, and hybrid fuzzing approaches to balance offense and defense in DeFi security.
Abstract
Smart contract vulnerabilities have led to billions in losses, yet finding actionable exploits remains challenging. Traditional fuzzers rely on rigid heuristics and struggle with complex attacks, while human auditors are thorough but slow and don't scale. Large Language Models offer a promising middle ground, combining human-like reasoning with machine speed. Early studies show that simply prompting LLMs generates unverified vulnerability speculations with high false positive rates. To address this, we present A1, an agentic system that transforms any LLM into an end-to-end exploit generator. A1 provides agents with six domain-specific tools for autonomous vulnerability discovery, from understanding contract behavior to testing strategies on real blockchain states. All outputs are concretely validated through execution, ensuring only profitable proof-of-concept exploits are reported. We evaluate A1 across 36 real-world vulnerable contracts on Ethereum and Binance Smart Chain. A1 achieves a 63% success rate on the VERITE benchmark. Across all successful cases, A1 extracts up to $8.59 million per exploit and $9.33 million total. Using Monte Carlo analysis of historical attacks, we demonstrate that immediate vulnerability detection yields 86-89% success probability, dropping to 6-21% with week-long delays. Our economic analysis reveals a troubling asymmetry: attackers achieve profitability at $6,000 exploit values while defenders require $60,000, raising fundamental questions about whether AI agents inevitably favor exploitation over defense.
