Revealing Adversarial Smart Contracts through Semantic Interpretation and Uncertainty Estimation

Yating Liu, Xing Su, Hao Wu, Sijin Li, Yuxi Cheng, Fengyuan Xu, Sheng Zhong

TL;DR

This paper presents FinDet, a training-free framework for pre-deployment detection of adversarial smart contracts directly from EVM bytecode. FinDet lifts bytecode into semi-structured natural language and analyzes it in two stages: semantic understanding from both general-purpose and attack-specific views, followed by probing, entropy-based uncertainty fusion that yields robust yes-or-no decisions. It introduces fund-flow reachability to capture the three-stage attack lifecycle and uses multi-view prompts to quantify LLM uncertainty, mitigating hallucinations. Empirical results show state-of-the-art performance (balanced accuracy, BAC, up to 0.9374; true positive rate, TPR, up to 0.9231) and strong generalization to unseen attack types, with real-world discovery of 29 adversarial contracts in a 10-day window. FinDet demonstrates practical viability for proactive DeFi security, maintaining performance under obfuscation and in low-data settings while remaining compatible with multiple LLM backbones.
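For readers unfamiliar with the headline metric, the sketch below shows how balanced accuracy is conventionally computed from a confusion matrix, as the mean of the true positive rate and the true negative rate. The function name and the counts in the usage lines are illustrative assumptions, not figures from the paper.

```python
def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of TPR (recall on adversarial contracts) and TNR (recall on benign ones)."""
    tpr = tp / (tp + fn)  # fraction of adversarial contracts that were flagged
    tnr = tn / (tn + fp)  # fraction of benign contracts that were passed
    return (tpr + tnr) / 2


# Hypothetical, heavily imbalanced test set: 10 adversarial vs. 990 benign contracts.
# A detector that labels everything benign reaches 99% plain accuracy,
# yet its balanced accuracy is only 0.5.
always_benign = balanced_accuracy(tp=0, fn=10, tn=990, fp=0)
plain_accuracy = (0 + 990) / 1000
```

Because adversarial contracts are rare relative to benign ones, a metric that weights both classes equally is a more informative summary than plain accuracy.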

Abstract

Adversarial smart contracts, mostly on EVM-compatible chains like Ethereum and BSC, are deployed as EVM bytecode to exploit vulnerable smart contracts for financial gain. Detecting such malicious contracts at deployment time is an important proactive strategy for preventing losses to victim contracts, and it offers a better cost-benefit ratio than detecting vulnerabilities across diverse potential victims. However, existing approaches are not generic: they cover a limited set of attack types, and their effectiveness suffers from imbalanced samples. Emerging LLM technologies show potential for generalization, but two key problems impede their application to this task: LLMs struggle to digest compiled-code inputs, especially those with task-specific logic, and it is hard to assess an LLM's certainty in its binary (yes-or-no) answers. We therefore propose FinDet, a generic framework for detecting adversarial smart contracts that leverages an LLM with two enhancements addressing these problems. FinDet takes only EVM bytecode contracts as input and identifies the adversarial ones among them with high balanced accuracy. The first enhancement extracts concise semantic intentions and high-level behavioral logic from the low-level bytecode inputs, unlocking LLM reasoning capability that the raw task input would otherwise restrict. The second enhancement probes and measures the LLM's uncertainty across multiple rounds of answering the same query, making its answers more robust for the binary classification the task requires. Our comprehensive evaluation shows that FinDet achieves a BAC of 0.9374 and a TPR of 0.9231, significantly outperforming existing baselines. It remains robust under challenging conditions including unseen attack patterns, low-data settings, and feature obfuscation. In a 10-day real-world test, FinDet detected all 5 publicly reported adversarial contracts and more than 20 unreported ones, all confirmed manually.
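The idea of measuring uncertainty over repeated yes-or-no answers can be illustrated with a minimal sketch. It assumes the answers have already been collected as strings, uses the Shannon entropy of their empirical distribution as the uncertainty score, and pairs it with a majority vote. The function names and the `max_entropy` threshold are hypothetical, and this is not FinDet's actual fusion procedure, which the abstract does not specify.

```python
import math
from collections import Counter


def answer_entropy(answers: list[str]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of the answers."""
    counts = Counter(answers)
    total = len(answers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def fuse(answers: list[str], max_entropy: float = 0.9) -> tuple[str, bool]:
    """Return the majority answer and whether it is considered confident.

    max_entropy is an illustrative threshold, not a value from the paper.
    """
    label = Counter(answers).most_common(1)[0][0]
    return label, answer_entropy(answers) <= max_entropy


# Hypothetical answers from four rounds of asking the same question.
consistent = fuse(["yes", "yes", "yes", "no"])  # entropy is about 0.81 bits
split = fuse(["yes", "no", "yes", "no"])        # entropy is exactly 1 bit
```

For a binary question the entropy ranges from 0 bits (all rounds agree) to 1 bit (an even split), so a single threshold is enough to separate stable answers from unstable ones in this simplified setting.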

Paper Structure

This paper contains 62 sections, 10 equations, 16 figures, 13 tables, 1 algorithm.

Figures (16)

  • Figure 1: Comparison of adversarial contract detection methods: (a) Rule-based methods rely on fixed patterns and fail on unexpected attacks. (b) ML-based methods leverage low-level features but struggle with few-shot or unseen cases, resulting in high FNR. (c) FinDet employs semantic reasoning for robust adversarial detection.
  • Figure 2: Three-phase adversarial lifecycle.
  • Figure 3: Token distribution of TAC on DeepSeek-V3.
  • Figure 4: Decompiled Solidity snippet of an adversarial contract (see https://bscscan.com/address/0x75f2002937507b826b727170728595fd45151d0f) involved in a flash loan attack on April 26, 2025. This case illustrates a typical flash loan exploit, following the three-phase fund-flow pattern. The right part visualizes the corresponding fund transfers across different addresses during the attack.
  • Figure 5: A high-level workflow of FinDet.
  • ...and 11 more figures