Enhancing Smart Contract Vulnerability Detection in DApps Leveraging Fine-Tuned LLM

Jiuyang Bu, Wenkai Li, Zongwei Li, Zeng Zhang, Xiaoqi Li

TL;DR

Smart contracts in real-world DApps contain vulnerabilities, many of which automated tools cannot audit (machine-unauditable). The authors fine-tune LLMs (Llama3-8B, Qwen2-7B) with Full-Parameter Fine-Tuning (FFT) and Low-Rank Adaptation (LoRA) on a real-world dataset of 215 DApps (4,998 contracts), complemented by Random Over Sampling (ROS) for data augmentation and a dual-audit prompt design. FFT-based models reach an F1-score of up to 0.83 with ROS and outperform prompt-based baselines and state-of-the-art tools; for price-manipulation vulnerabilities, precision reaches 0.97. The work indicates that domain-specific LLM fine-tuning combined with data augmentation can detect machine-unauditable vulnerabilities in real-world DApps, offering a practical path toward blockchain ecosystem protection.

Abstract

Decentralized applications (DApps) face significant security risks due to vulnerabilities in smart contracts, with traditional detection methods struggling to address emerging and machine-unauditable flaws. This paper proposes a novel approach leveraging fine-tuned Large Language Models (LLMs) to enhance smart contract vulnerability detection. We introduce a comprehensive dataset of 215 real-world DApp projects (4,998 contracts), including hard-to-detect logical errors like token price manipulation, addressing the limitations of existing simplified benchmarks. By fine-tuning LLMs (Llama3-8B and Qwen2-7B) with Full-Parameter Fine-Tuning (FFT) and Low-Rank Adaptation (LoRA), our method achieves superior performance, attaining an F1-score of 0.83 with FFT and data augmentation via Random Over Sampling (ROS). Comparative experiments demonstrate significant improvements over prompt-based LLMs and state-of-the-art tools. Notably, the approach excels in detecting non-machine-auditable vulnerabilities, achieving 0.97 precision and 0.68 recall for price manipulation flaws. The results underscore the effectiveness of domain-specific LLM fine-tuning and data augmentation in addressing real-world DApp security challenges, offering a robust solution for blockchain ecosystem protection.
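
The abstract names Random Over Sampling (ROS) as the data augmentation step but does not describe the procedure. The following is a minimal sketch of generic random oversampling, assuming that minority classes are duplicated with replacement until every class matches the size of the largest one. The function name, the label strings, and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Balance classes by duplicating minority-class samples at random.

    Assumption: every class is grown to the size of the largest class.
    The paper's exact ROS settings are not given in the abstract.
    """
    rng = random.Random(seed)

    # Group samples by their label, preserving first-seen label order.
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)

    # Every class is padded up to the size of the largest class.
    target = max(len(group) for group in by_label.values())

    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data with hypothetical labels: four "safe" contracts, one vulnerable.
contracts = ["c1.sol", "c2.sol", "c3.sol", "c4.sol", "c5.sol"]
labels = ["safe", "safe", "safe", "safe", "price_manipulation"]

balanced_contracts, balanced_labels = random_oversample(contracts, labels)
print(Counter(balanced_labels))
```

Oversampling of this kind is normally applied to the training split only, so that duplicated samples do not leak into evaluation data; whether the authors follow that convention is not stated in the abstract.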

Paper Structure

This paper contains 16 sections, 3 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Basic process of LLM detection vulnerability
  • Figure 2: LLM Dual Audit-Verification for Smart Contract Vulnerabilities
  • Figure 3: Dialogue Format for LLM-Based Smart Contract Audits
  • Figure 4: Basic Prompts
  • Figure 5: Code Detection Prompts
  • ...and 1 more figure