Does Programming Language Matter? An Empirical Study of Fuzzing Bug Detection

Tatsuya Shirai; Olivier Nourry; Yutaro Kashiwa; Kenji Fujiwara; Hajimu Iida

TL;DR

This study empirically investigates whether programming language affects fuzzing bug detection by analyzing 61,444 fuzzing bugs and 999,248 OSS-Fuzz builds across 559 projects, categorized by primary language. Four research questions compare bug frequency, vulnerability patterns, reproducibility, and fuzzing efficiency among languages, revealing language-specific differences in crash types, severity distributions, and detection dynamics. Key findings include higher detection frequencies in C++ and Rust; low vulnerability ratios in Rust and Python, but with the vulnerabilities that do occur concentrated at high severity; language-dependent reproducibility (near 99% for Rust versus around 50% for Go); and varying patch coverage and time-to-detection across languages. The results support language-aware fuzzing strategies and tool development, suggesting that fuzzing platforms should provide language-tuned or directed fuzzing approaches to address distinct language characteristics and security profiles.

Abstract

Fuzzing has become a popular technique for automatically detecting vulnerabilities and bugs by generating unexpected inputs. In recent years, the fuzzing process has been integrated into continuous integration workflows (i.e., continuous fuzzing), enabling short and frequent testing cycles. Despite its widespread adoption, prior research has not examined whether the effectiveness of continuous fuzzing varies across programming languages. This study conducts a large-scale cross-language analysis to examine how fuzzing bug characteristics and detection efficiency differ among languages. We analyze 61,444 fuzzing bugs and 999,248 builds from 559 OSS-Fuzz projects categorized by primary language. Our findings reveal that (i) C++ and Rust exhibit higher fuzzing bug detection frequencies, (ii) Rust and Python show low vulnerability ratios but tend to expose more critical vulnerabilities, (iii) crash types vary across languages and unreproducible bugs are more frequent in Go but rare in Rust, and (iv) Python attains higher patch coverage but suffers from longer time-to-detection. These results demonstrate that fuzzing behavior and effectiveness are strongly shaped by language design, providing insights for language-aware fuzzing strategies and tool development.
