A Comparative Study of Fuzzers and Static Analysis Tools for Finding Memory Unsafety in C and C++

Keno Hassler, Philipp Görz, Stephan Lipp, Thorsten Holz, Marcel Böhme

TL;DR

Memory-safety vulnerabilities in C/C++ remain a critical risk in modern software. This paper presents the first systematic cross-domain comparison of fuzzing and static analysis, applying five static analyzers and 13 fuzzers to the Magma benchmark (112 CVEs across seven programs) to quantify true positives, overhead, and complementarity. The findings show that fuzzers and static analyzers detect largely different bugs, with AFL++ and CodeQL leading their respective classes. Combining the two yields more comprehensive coverage, but at the cost of high manual effort for triaging false positives and deduplicating reports. The work provides practical guidance for maintainers, discusses integration into development workflows, and outlines future research directions to foster collaboration between the fuzzing and static-analysis communities. Overall, the results suggest adopting a hybrid bug-finding strategy and moving toward safer language designs to improve memory safety proactively.

Abstract

Even today, over 70% of security vulnerabilities in critical software systems result from memory safety violations. To address this challenge, fuzzing and static analysis are widely used automated methods to discover such vulnerabilities. Fuzzing generates random program inputs to identify faults, while static analysis examines source code to detect potential vulnerabilities. Although these techniques share a common goal, they take fundamentally different approaches and have evolved largely independently. In this paper, we present an empirical analysis of five static analyzers and 13 fuzzers, applied to over 100 known security vulnerabilities in C/C++ programs. We measure the number of bug reports generated for each vulnerability to evaluate how the approaches differ and complement each other. Moreover, we randomly sample eight bug-containing functions, manually analyze all bug reports therein, and quantify false-positive rates. We also assess limits to bug discovery, ease of use, resource requirements, and integration into the development process. We find that both techniques discover different types of bugs, but there are clear winners for each. Developers should consider these tools depending on their specific workflow and usability requirements. Based on our findings, we propose future directions to foster collaboration between these research domains.
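As an illustration of the kind of measurement the abstract describes, the following minimal sketch computes how two sets of detected bugs overlap and what share of manually reviewed reports were false alarms. It is not the authors' tooling: the function names, the bug identifiers, and the definition of the false-positive rate as the fraction of reviewed reports that are not true positives are all assumptions made for this example.

```python
from typing import Dict, Set


def complementarity(fuzz_found: Set[str], sast_found: Set[str]) -> Dict[str, Set[str]]:
    """Split detected bugs into those found by both groups and by only one.

    The inputs are sets of bug identifiers (hypothetical in this sketch).
    """
    return {
        "both": fuzz_found & sast_found,
        "fuzz_only": fuzz_found - sast_found,
        "sast_only": sast_found - fuzz_found,
    }


def false_positive_rate(true_positives: int, total_reports: int) -> float:
    """Fraction of reviewed reports that are not true positives (assumed definition)."""
    if total_reports <= 0:
        raise ValueError("total_reports must be positive")
    if not 0 <= true_positives <= total_reports:
        raise ValueError("true_positives must lie between 0 and total_reports")
    return (total_reports - true_positives) / total_reports


if __name__ == "__main__":
    # Hypothetical bug identifiers, not data from the paper.
    fuzz = {"BUG-A", "BUG-B", "BUG-C"}
    sast = {"BUG-C", "BUG-D"}
    for group, bugs in complementarity(fuzz, sast).items():
        print(group, sorted(bugs))
    print(false_positive_rate(true_positives=2, total_reports=8))
```

A small overlap combined with large exclusive sets would indicate that the two groups of tools complement each other, which is the pattern the abstract reports.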

Paper Structure

This paper contains 26 sections, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Flags per tool and bug class on Magma and CGC.
  • Figure 2: Intersection plot (UpSet; Lex et al., 2014) of Magma CVEs found by each analyzer or fuzzer, as well as by both groups (Sum_*). Every row in the intersection stands for the set of bugs found by a tool, and every column is a set of bugs commonly detected by a group of tools. A color-filled circle indicates that a tool is part of a tool group. Bar plots show the respective set sizes for tools and tool groups.
  • Figure 3: Detected Magma CVEs per tool across bug types. For fuzzers, a bug is found if it is detected once. Bug types (see the table of bug-type codenames) not covered by Magma are omitted.
  • Figure 4: The number of discovered CVEs / generated SAST bug reports across vulnerability types. The color represents the ratio between reports and bugs. A reddish color indicates no true positives, while a blueish color indicates that true positives are found, with darker blue indicating better ratios. For CGC, Clang SA cannot be used because of the custom standard library, and there are no Logic bugs; hence we omit the corresponding column and row.
  • Figure 5: Projects using Fuzzing or SAST tools during CI.
  • ...and 3 more figures