Fuzzing at Scale: The Untold Story of the Scheduler

Ivica Nikolic, Racchit Jain

TL;DR

This work reframes fuzzing as a problem of scale: when thousands of programs must be fuzzed on limited CPU resources, a scheduler decides how those resources are shared. It introduces Boian, a dynamic scheduler guided by a multi-armed bandit (MAB) with non-stationary rewards and evolving exploration, which significantly improves total code coverage across a large program set and increases the number of programs that benefit from fuzzing. On UbuntuBench (≈5,467 targets) and OSS-Fuzz benchmarks, Boian achieves gains of up to ~30% over a naive baseline, showing that scheduler improvements can rival those from using stronger fuzzers. The authors also construct a large Ubuntu-based benchmark and show that fuzzing at scale with sophisticated scheduling enables rapid bug discovery: a three-day run on 30 CPUs found 4,908 bugs across 675 Ubuntu programs.

Abstract

How can one search for bugs in 1,000 programs using a pre-existing fuzzer and a standard PC? We consider this problem and show that a well-designed strategy for deciding which programs to fuzz, and for how long, can greatly affect the number of bugs found across the programs. In fact, the impact of an effective strategy is comparable to that of using a state-of-the-art fuzzer. We refer to the problem as fuzzing at scale and to the strategy as the scheduler. Besides a naive scheduler, which allocates equal fuzz time to all programs, we consider dynamic schedulers that adjust time allocation based on the ongoing fuzzing progress of individual programs. Such schedulers are superior because they lead both to a higher total number of bugs found and to a higher number of bugs found for most programs. The performance gap between naive and dynamic schedulers can be as wide as, or even wider than, the gap between two fuzzers. Our findings thus suggest that advancing schedulers is a fundamental problem for fuzzing at scale. We develop several schedulers and use the most sophisticated one to simultaneously fuzz our newly compiled benchmark of around 5,000 Ubuntu programs, detecting 4,908 bugs.
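
The difference between a naive and a dynamic scheduler can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the saturating coverage model, the program parameters, and all function names are hypothetical, and the greedy rule (give the next time slice to the program whose most recent slice produced the largest gain) is a simple stand-in, not the Boian scheduler or any other scheduler from the paper. Coverage gain is used here as the progress signal.

```python
import math

# Hypothetical model: a program is (cap, rate); its coverage after
# `slices` time slices saturates as cap * (1 - exp(-rate * slices)).
def coverage(cap, rate, slices):
    return cap * (1.0 - math.exp(-rate * slices))

def total_coverage(programs, slices):
    """Sum of simulated coverage over all programs for a given allocation."""
    return sum(coverage(cap, rate, s) for (cap, rate), s in zip(programs, slices))

def naive_schedule(programs, budget):
    """Naive scheduler: split the slice budget equally across programs."""
    base, extra = divmod(budget, len(programs))
    return [base + (1 if i < extra else 0) for i in range(len(programs))]

def dynamic_schedule(programs, budget):
    """Greedy stand-in for a dynamic scheduler (an assumption, not Boian).

    Every program is tried once (optimistic initial estimate), then each
    remaining slice goes to the program with the largest last observed gain.
    """
    n = len(programs)
    slices = [0] * n
    last_gain = [math.inf] * n  # untried programs are picked first
    for _ in range(budget):
        i = max(range(n), key=lambda k: last_gain[k])
        before = coverage(*programs[i], slices[i])
        slices[i] += 1
        last_gain[i] = coverage(*programs[i], slices[i]) - before
    return slices

if __name__ == "__main__":
    # Two programs that respond to fuzzing and two that never make progress.
    programs = [(1000.0, 0.05), (200.0, 0.5), (0.0, 1.0), (0.0, 1.0)]
    budget = 40
    for name, schedule in (("naive", naive_schedule), ("dynamic", dynamic_schedule)):
        alloc = schedule(programs, budget)
        print(name, alloc, round(total_coverage(programs, alloc), 1))
```

In this toy setting the naive scheduler spends half of its budget on programs that never make progress, while the greedy rule stops revisiting them after a single slice. A real scheduler cannot rely on such clean signals, which is why the paper turns to bandit-style exploration with non-stationary rewards.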

Paper Structure

This paper contains 14 sections, 1 equation, 2 figures, 6 tables, 2 algorithms.

Figures (2)

  • Figure 1: The total code coverage over time obtained by fuzzing 1,000 programs with the AFL++ fuzzer.
  • Figure 2: The total number of bugs found in UbuntuBench during a three-day fuzzing-at-scale campaign with AFL++ and Honggfuzz, each running on 30 CPUs.