Table of Contents
Fetching ...

Bounded Exhaustive Random Program Generation for Testing Solidity Compilers

Haoyang Ma, Alastair F. Donaldson, Qingchao Shen, Yongqiang Tian, Junjie Chen, Shing-Chi Cheung

TL;DR

The paper tackles the vulnerability of Solidity compilers by addressing the inefficiency of traditional random program generation, which often misses bug-prone subspaces. It introduces bounded exhaustive random program generation implemented as Erwin, a two-stage process that first synthesizes valid program templates with qualifier placeholders and constraints, then exhaustively enumerates all permissible qualifier substitutions to produce concrete test programs, aided by a constraint-reduction mechanism. Empirical evaluation shows Erwin discovers 26 bugs across solc, solang, and Slither (with 23 previously unknown and 10 fixed), and delivers significantly higher coverage than unit tests and state-of-the-art fuzzers, while also providing seeds to enhance fuzzing pipelines. The approach demonstrates strong bug-detection efficiency and cross-system effectiveness, with potential to generalize to other languages and compilers, and suggests directions for expanding Solidity constructs, improving specifications, and integrating with mutation-based fuzzers for broader software testing.

Abstract

By July 2025, smart contracts collectively manage roughly $120 billion in assets. With Solidity remaining the dominant language for smart contract development, the correctness of Solidity compilers has become critically important. However, Solidity compilers are bug-prone, with a recent study revealing that combinations of qualifiers in Solidity programs are the primary cause of compiler crashes, accounting for 40.5% of all historical crashes. While random program generators are widely used for compiler testing, they may be less effective at finding Solidity compiler bugs because they explore the unbounded space of possible programs rather than concentrating on the specific subspace related to bug-prone qualifiers. A promising idea for finding qualifier-related bugs is to bound the search space based on empirical evidence of where such bugs are likely to occur, specifically focusing test generation to target subspaces with rich combinations of qualifiers. To address this, we propose bounded exhaustive random program generation, a novel approach that dynamically bounds the search space, enhancing the likelihood of uncovering Solidity compiler bugs. Specifically, our method bounds the search space by generating valid program templates that abstract programs that use bug-prone qualifiers, and then uses these templates as a basis for compiler testing through exhaustive enumeration of suitable qualifiers. Mechanisms are devised to address technical challenges regarding validity and efficiency. We have implemented our novel generation approach in a new tool, Erwin. We have used Erwin to find and report 26 bugs across two Solidity compilers, solc and solang, and one Solidity static analyzer, slither. Among these, 23 were previously unknown, 18 have been confirmed, and 10 have been fixed. Evaluation results demonstrate that Erwin outperforms state-of-the-art Solidity fuzzers in bug detection.

Bounded Exhaustive Random Program Generation for Testing Solidity Compilers

TL;DR

The paper tackles the vulnerability of Solidity compilers by addressing the inefficiency of traditional random program generation, which often misses bug-prone subspaces. It introduces bounded exhaustive random program generation implemented as Erwin, a two-stage process that first synthesizes valid program templates with qualifier placeholders and constraints, then exhaustively enumerates all permissible qualifier substitutions to produce concrete test programs, aided by a constraint-reduction mechanism. Empirical evaluation shows Erwin discovers 26 bugs across solc, solang, and Slither (with 23 previously unknown and 10 fixed), and delivers significantly higher coverage than unit tests and state-of-the-art fuzzers, while also providing seeds to enhance fuzzing pipelines. The approach demonstrates strong bug-detection efficiency and cross-system effectiveness, with potential to generalize to other languages and compilers, and suggests directions for expanding Solidity constructs, improving specifications, and integrating with mutation-based fuzzers for broader software testing.

Abstract

By July 2025, smart contracts collectively manage roughly $120 billion in assets. With Solidity remaining the dominant language for smart contract development, the correctness of Solidity compilers has become critically important. However, Solidity compilers are bug-prone, with a recent study revealing that combinations of qualifiers in Solidity programs are the primary cause of compiler crashes, accounting for 40.5% of all historical crashes. While random program generators are widely used for compiler testing, they may be less effective at finding Solidity compiler bugs because they explore the unbounded space of possible programs rather than concentrating on the specific subspace related to bug-prone qualifiers. A promising idea for finding qualifier-related bugs is to bound the search space based on empirical evidence of where such bugs are likely to occur, specifically focusing test generation to target subspaces with rich combinations of qualifiers. To address this, we propose bounded exhaustive random program generation, a novel approach that dynamically bounds the search space, enhancing the likelihood of uncovering Solidity compiler bugs. Specifically, our method bounds the search space by generating valid program templates that abstract programs that use bug-prone qualifiers, and then uses these templates as a basis for compiler testing through exhaustive enumeration of suitable qualifiers. Mechanisms are devised to address technical challenges regarding validity and efficiency. We have implemented our novel generation approach in a new tool, Erwin. We have used Erwin to find and report 26 bugs across two Solidity compilers, solc and solang, and one Solidity static analyzer, slither. Among these, 23 were previously unknown, 18 have been confirmed, and 10 have been fixed. Evaluation results demonstrate that Erwin outperforms state-of-the-art Solidity fuzzers in bug detection.

Paper Structure

This paper contains 35 sections, 13 equations, 16 figures, 3 tables, 2 algorithms.

Figures (16)

  • Figure 1: Random program generation
  • Figure 2: Bug-triggering test program for GitHub issue #9134 bug9134
  • Figure 3: Workflow of Erwin
  • Figure 4: Syntax of templates, where $\overline{x}$ denotes a sequence of elements each of the form $x$
  • Figure 5: Qualifier codomain initialization in Solidity
  • ...and 11 more figures

Theorems & Definitions (3)

  • Definition 3.1: Constraint Set
  • Example 3.1
  • Example 3.2