Evaluating the Vulnerability Landscape of LLM-Generated Smart Contracts

Hoang Long Do, Nasrin Sohrabi, Muneeb Ul Hassan

TL;DR

The paper tackles the risk of security flaws in Solidity contracts generated by state-of-the-art LLMs. It introduces a full pipeline—contract collection, semantic feature extraction, prompt mutation, multi-model code generation, and Slither-based vulnerability analysis—applied to 1,033 contracts across DeFi, governance, and utility domains. Key findings show high vulnerability rates across GPT-4.1, Gemini-2.5, and Sonnet-4.5, with larger contracts more prone to issues, underscoring that syntactic correctness does not ensure security. The work delivers a vulnerability taxonomy, threat model, and practical guidelines for integrating AI-assisted contract generation with rigorous auditing and verification to improve production-ready security in blockchain ecosystems.
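
The last stage of the pipeline, Slither-based vulnerability analysis, can be illustrated with a minimal sketch. It assumes each generated contract has already been analyzed and that its report follows the JSON layout Slither writes with its `--json` option (a `results.detectors` list whose entries carry `check` and `impact` fields). The helper names and the choice to count only High and Medium impact findings as vulnerabilities are assumptions made for illustration; the paper's own scripts and thresholds are not given in this summary.

```python
from collections import Counter

# Assumption: only these Slither impact levels count as a "vulnerability".
SEVERE_IMPACTS = {"High", "Medium"}


def severe_findings(report: dict) -> list[str]:
    """Return the detector names of High/Medium findings in one Slither JSON report."""
    results = report.get("results") or {}
    detectors = results.get("detectors", [])
    return [d["check"] for d in detectors if d.get("impact") in SEVERE_IMPACTS]


def vulnerability_rate(reports: list[dict]) -> float:
    """Fraction of contracts that have at least one severe finding."""
    if not reports:
        return 0.0
    flagged = sum(1 for r in reports if severe_findings(r))
    return flagged / len(reports)


def detector_counts(reports: list[dict]) -> Counter:
    """Tally how often each severe detector fires across all reports."""
    return Counter(check for r in reports for check in severe_findings(r))
```

Grouping the reports by the model that produced each contract and calling `vulnerability_rate` once per group would give a per-model rate of the kind the key findings refer to.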

Abstract

Large language models (LLMs) have been widely adopted in modern software development lifecycles, where they are increasingly used to automate and assist code generation, significantly improving developer productivity and reducing development time. In the blockchain domain, developers increasingly rely on LLMs to generate and maintain smart contracts, the immutable, self-executing components of decentralized applications. Because deployed smart contracts cannot be modified, correctness and security are paramount, particularly in high-stakes domains such as finance and governance. Despite this growing reliance, the security implications of LLM-generated smart contracts remain insufficiently understood. In this work, we conduct a systematic security analysis of Solidity smart contracts generated by state-of-the-art LLMs, including ChatGPT, Gemini, and Sonnet. We evaluate these contracts against a broad set of known smart contract vulnerabilities to assess their suitability for direct deployment in production environments. Our extensive experimental study shows that, despite their syntactic correctness and functional completeness, LLM-generated smart contracts frequently exhibit severe security flaws that could be exploited in real-world settings. We further analyze and categorize these vulnerabilities, identifying recurring weakness patterns across different models. Finally, we discuss practical countermeasures and development guidelines to help mitigate these risks, offering actionable insights for both developers and researchers. Our findings aim to support safe integration of LLMs into smart contract development workflows and to strengthen the overall security of the blockchain ecosystem against future security failures.

Paper Structure (32 sections, 1 equation, 6 figures, 6 tables, 3 algorithms)

Figures (6)

  • Figure 1: Scenario where security issues caused serious damage to critical systems.
  • Figure 2: Proposed streamlined system for vulnerability testing of agent-generated smart contracts. It comprises five modules: smart contract collection, feature extraction, prompt generation, agent-driven code generation, and vulnerability analysis.
  • Figure 3: A sample smart contract used in the analysis.
  • Figure 4: Vulnerability spawn rate of each chosen model.
  • Figure 5: Line-of-Code distribution in vulnerable contracts.
  • ...and 1 more figure