AI Agent Smart Contract Exploit Generation

Arthur Gervais, Liyi Zhou

TL;DR

The paper addresses the challenge of scalable smart-contract security auditing by introducing A1, an end-to-end agentic system that turns general-purpose LLMs into autonomous exploit generators for DeFi contracts. Leveraging six domain-specific tools and a test-time execution loop on forked blockchain states, A1 discovers, tests, and monetizes real-world vulnerabilities with concrete PoCs and execution validation. Empirical evaluation across 36 incidents and six LLMs demonstrates competitive performance on VERITE (63% success) and substantial potential revenue, while revealing important dynamics around detection latency, economic incentives, and model behavior (including memorization versus reasoning). The work highlights the practical potential and risks of AI-driven vulnerability discovery and motivates future research on defense-oriented AI tooling, rapid detection pipelines, and hybrid fuzzing approaches to balance offense and defense in DeFi security.

Abstract

Smart contract vulnerabilities have led to billions in losses, yet finding actionable exploits remains challenging. Traditional fuzzers rely on rigid heuristics and struggle with complex attacks, while human auditors are thorough but slow and don't scale. Large Language Models offer a promising middle ground, combining human-like reasoning with machine speed. Early studies show that simply prompting LLMs generates unverified vulnerability speculations with high false positive rates. To address this, we present A1, an agentic system that transforms any LLM into an end-to-end exploit generator. A1 provides agents with six domain-specific tools for autonomous vulnerability discovery, from understanding contract behavior to testing strategies on real blockchain states. All outputs are concretely validated through execution, ensuring only profitable proof-of-concept exploits are reported. We evaluate A1 across 36 real-world vulnerable contracts on Ethereum and Binance Smart Chain. A1 achieves a 63% success rate on the VERITE benchmark. Across all successful cases, A1 extracts up to $8.59 million per exploit and $9.33 million total. Using Monte Carlo analysis of historical attacks, we demonstrate that immediate vulnerability detection yields 86-89% success probability, dropping to 6-21% with week-long delays. Our economic analysis reveals a troubling asymmetry: attackers achieve profitability at $6,000 exploit values while defenders require $60,000, raising fundamental questions about whether AI agents inevitably favor exploitation over defense.

Paper Structure

This paper contains 33 sections, 1 equation, 7 figures, 12 tables, 2 algorithms.

Figures (7)

  • Figure 1: A1 accesses six tools: (i) a source code fetcher that resolves proxy contracts, (ii) a constructor parameter extractor, (iii) a state reader for querying functions, (iv) a code sanitizer that removes extraneous elements, (v) a concrete execution tool for validating exploit strategies, and (vi) a revenue normalizer that converts extracted tokens to native currency. Given target parameters (contract address, block number), A1 decides which tools to use and when. The agent generates exploits as compilable Solidity contracts and tests them on real historical blockchain states, using execution feedback to guide its reasoning.
  • Figure 2: Multi-turn agentic workflow for the sgETH incident. Gray <truncated> marks omitted lines; colored lines highlight important instructions.
  • Figure 3: Timing analysis. (a) Violin plots show execution time distributions by model. o3-pro is the slowest (mean: 34.0 min), often exceeding typical attack windows, while Gemini Flash is the fastest (mean: 5.9 min). (b) CDF plots compare exploit runtimes against historical attack-window durations on the VERITE dataset. A run is successful when its runtime is shorter than the residual attack window. Success probabilities are estimated via Monte Carlo sampling ($10^{5}$ random pairs per model), with 95% confidence intervals shown in parentheses. For example, without detection delay the success probabilities are: o3 88.5% (88.4–88.7%), o3-pro 85.9% (85.7–86.1%), Gemini Pro 88.8% (88.6–89.0%), R1 88.8% (88.6–89.0%), Qwen3 MoE 88.7% (88.5–88.9%), and Gemini Flash 88.8% (88.6–89.0%). Among the 19 incidents, 83% lasted longer than one hour (15/18) and 50% longer than 24 days (9/18). See Tables \ref{tab:timing_stats} and \ref{tab:delay_success} for full statistics.
  • Figure 4: Token usage analysis across 432 experiments with 16.8% success rate. Total estimated cost: $335.38. Violin plots show distribution of total tokens per experiment, split by success/failure. Max and min values are annotated on each violin. Costs calculated using published pricing per 1M tokens (reasoning tokens included in completion costs). See Table \ref{tab:token_stats} in Appendix \ref{app:token} for detailed statistics by model and iteration. Mean tokens per experiment (±std): o3 (73M ± 41M tokens, $0.35); o3-pro (74M ± 47M tokens, $3.59); Gemini Pro (114M ± 65M tokens, $0.56); Gemini Flash (132M ± 47M tokens, $0.03); R1 (82M ± 29M tokens, $0.10); Qwen3 MoE (84M ± 26M tokens, $0.03).
  • Figure 5: Economic viability analysis showing expected profit (USD) per analyzed contract as a function of detection delay (x-axis, days) and vulnerability incidence rate (y-axis, log scale). The incidence rate denotes how often exploitable vulnerabilities occur (e.g., 0.1% = 1 in 1000 contracts). Colors indicate expected profit, with white at break-even; black contours mark break-even boundaries. Assumptions: maximum revenue of $20k per exploit and costs set to the 95th percentile plus $3 overhead. Key results: o3-pro remains profitable up to 30 days at 0.6% incidence, while faster models require much higher rates ($\gg 1\%$). Overall, viability depends strongly on rapid detection and accurate targeting.
  • ...and 2 more figures
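
The Monte Carlo estimate described in the Figure 3 caption can be illustrated with a minimal sketch. It pairs a randomly drawn runtime with a randomly drawn attack window and counts the pair as a success when the runtime is shorter than the window minus the detection delay. The function name, the uniform sampling with replacement, and the normal-approximation confidence interval are assumptions made for illustration; the caption does not state how the authors sample or compute their intervals.

```python
import random


def success_probability(runtimes_min, windows_min, delay_min=0.0,
                        n_pairs=100_000, seed=0):
    """Estimate P(runtime < residual attack window) from random pairs.

    runtimes_min: observed exploit-generation runtimes (minutes).
    windows_min:  historical attack-window durations (minutes).
    delay_min:    detection delay subtracted from each window.
    Returns (estimate, (ci_low, ci_high)) with a 95% interval.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_pairs):
        runtime = rng.choice(runtimes_min)             # sample one runtime
        residual = rng.choice(windows_min) - delay_min  # window left after delay
        if runtime < residual:
            hits += 1
    p = hits / n_pairs
    # Assumption: normal-approximation (Wald) interval for a proportion.
    half_width = 1.96 * (p * (1.0 - p) / n_pairs) ** 0.5
    return p, (max(0.0, p - half_width), min(1.0, p + half_width))


if __name__ == "__main__":
    # Hypothetical inputs, not data from the paper.
    p, ci = success_probability([5.9, 34.0], [30.0, 600.0, 50_000.0])
    print(f"success probability: {p:.3f}, 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Increasing `delay_min` shrinks every residual window, which is the mechanism behind the drop in success probability under longer detection delays.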
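
The cost accounting mentioned in the Figure 4 caption (published pricing per 1M tokens, with reasoning tokens billed as completion tokens) can be sketched as follows. The `Usage` type, the function name, and the prices in the usage example are hypothetical; they are not taken from the paper or from any provider's price list.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Token counts for one experiment (hypothetical container)."""
    prompt_tokens: int
    completion_tokens: int
    # Leave at 0 if the provider already counts reasoning tokens
    # inside completion_tokens, to avoid double billing.
    reasoning_tokens: int = 0


def experiment_cost_usd(usage, input_price_per_m, output_price_per_m):
    """Cost in USD given prices quoted per one million tokens."""
    # Reasoning tokens are charged at the completion (output) rate.
    billed_output = usage.completion_tokens + usage.reasoning_tokens
    total = (usage.prompt_tokens * input_price_per_m
             + billed_output * output_price_per_m)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical token counts and prices.
    usage = Usage(prompt_tokens=60_000, completion_tokens=8_000,
                  reasoning_tokens=5_000)
    print(f"${experiment_cost_usd(usage, 2.0, 8.0):.4f}")
```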
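
The break-even boundary in the Figure 5 caption can be made concrete with a simple expected-value model. The formula below (incidence rate times success probability times revenue, minus per-contract analysis cost) is an assumption chosen for illustration; the caption does not give the paper's exact equation, and all numeric inputs in the usage example are hypothetical apart from the $20k revenue cap mentioned in the caption.

```python
def expected_profit_usd(incidence_rate, success_prob, revenue_usd, cost_usd):
    """Expected profit per analyzed contract under the assumed model.

    incidence_rate: fraction of contracts with an exploitable bug (0.001 = 0.1%).
    success_prob:   probability the exploit lands inside the attack window.
    revenue_usd:    revenue per successful exploit.
    cost_usd:       analysis cost paid for every contract, vulnerable or not.
    """
    return incidence_rate * success_prob * revenue_usd - cost_usd


def break_even_incidence(success_prob, revenue_usd, cost_usd):
    """Smallest incidence rate at which expected profit reaches zero."""
    if success_prob <= 0 or revenue_usd <= 0:
        return float("inf")  # no incidence rate can cover the cost
    return cost_usd / (success_prob * revenue_usd)


if __name__ == "__main__":
    # Hypothetical cost and success probability; $20k cap from the caption.
    rate = break_even_incidence(success_prob=0.5, revenue_usd=20_000, cost_usd=50)
    print(f"break-even incidence: {rate:.2%}")
```

Under this model, a longer detection delay lowers `success_prob`, which raises the incidence rate needed to break even. That matches the qualitative dependence on delay and targeting accuracy described in the caption.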