SmartPoC: Generating Executable and Validated PoCs for Smart Contract Bug Reports
Longfei Chen, Ruibin Yan, Taiyu Wong, Yiyang Chen, Chao Zhang
TL;DR
SmartPoC addresses the verification gap in smart-contract auditing by converting textual static-analysis findings into executable, validated PoCs. It combines three components: BCE, which produces bug-focused context; the GRE-Engine, which iteratively generates and repairs PoCs; and DV (differential verification), which provides a runtime oracle for validating exploitability. Across SmartBugs-Vul, FORGE-Vul, and Latest-114, SmartPoC achieves high accuracy on ground-truth data, reliable validation of static findings, and real-world effectiveness at low cost (about $0.03 per finding). The approach enables scalable, automated verification of complex, logic-centric vulnerabilities, and a dataset of validated PoCs is released to support future work.
Abstract
Smart contracts are prone to vulnerabilities and are analyzed both by human experts and by automated systems, such as static analyzers and AI-assisted tools. However, the resulting audit artifacts are heterogeneous and often lack reproducible, executable proof-of-concept (PoC) tests suitable for automated validation, which leads to costly, ad hoc manual verification. Large language models (LLMs) can be used to turn audit reports into PoC test cases, but doing so raises three major challenges: noisy inputs, hallucinations, and missing runtime oracles. In this paper, we present SmartPoC, an automated framework that converts textual audit reports into executable, validated test cases. First, the input audit report is processed to reduce noise, and only bug-related functions are extracted and fed to the LLM as context. Second, to curb hallucinations and ensure that the output compiles and runs, we use LLMs to synthesize PoC test cases with specially designed pre- and post-execution repair. Third, we use differential verification as an oracle to confirm the exploitability of the PoC test cases. On the SmartBugs-Vul and FORGE-Vul benchmarks, SmartPoC generates executable, validated Foundry test cases for 85.61% and 86.45% of targets, respectively. Applied to the latest Etherscan verified-source corpus, SmartPoC confirms 236 real bugs out of 545 audit findings at a cost of only $0.03 per finding.
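The generate, repair, and validate loop described above can be illustrated with a minimal Python sketch. It is not SmartPoC's implementation. The abstract does not say what the differential oracle compares, so the sketch assumes a PoC counts as validated when it succeeds against the target contract and fails against a reference variant (for example, a patched version). All names here (RunResult, differential_verdict, generate_repair_validate, and the callback parameters) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class RunResult:
    """Outcome of running one PoC test against one contract variant."""
    compiled: bool
    passed: bool


def differential_verdict(on_target: RunResult, on_reference: RunResult) -> str:
    """Assumed oracle: the exploit must succeed on the target and fail on the reference."""
    if not (on_target.compiled and on_reference.compiled):
        return "not-executable"
    if on_target.passed and not on_reference.passed:
        return "validated"
    return "unconfirmed"


def generate_repair_validate(
    generate: Callable[[], str],
    repair: Callable[[str, str], str],
    run_target: Callable[[str], RunResult],
    run_reference: Callable[[str], RunResult],
    max_repairs: int = 2,
) -> Tuple[str, str]:
    """Generate a PoC, then repair it until it validates or the budget is spent."""
    poc = generate()
    verdict = differential_verdict(run_target(poc), run_reference(poc))
    repairs = 0
    while verdict != "validated" and repairs < max_repairs:
        # The verdict is passed back as feedback; a real system would pass logs.
        poc = repair(poc, verdict)
        verdict = differential_verdict(run_target(poc), run_reference(poc))
        repairs += 1
    return poc, verdict
```

In practice, the generate and repair callbacks would wrap LLM calls, and the two run callbacks would execute the test with a tool such as Foundry. Passing them as plain functions keeps the control flow testable without either dependency.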
