Program Environment Fuzzing

Ruijie Meng, Gregory J. Duck, Abhik Roychoudhury

TL;DR

The paper starts from the observation that program behavior is driven by a complex execution environment, and that conventional environment capture in symbolic execution and model checking relies on manually built environment models. It proposes EnvFuzz, a greybox fuzzing framework that records environment interactions as system calls at the kernel/user-mode boundary and replays them with targeted mutations, thereby fuzzing the full program environment without an explicit environment model. The approach combines a two-phase workflow (record, then replay with a tree-based fuzzing search guided by branch and state coverage feedback) with a relaxed replay mechanism that tolerates divergence from the recording, and it is implemented in C++. Empirically, EnvFuzz finds 33 previously unknown bugs (16 assigned CVEs) across 20 real-world subjects and outperforms baselines on network protocols in code coverage and throughput, and it applies to Linux user-space applications with little manual setup.

Abstract

Computer programs are not executed in isolation, but rather interact with the execution environment which drives the program behaviors. Software validation methods thus need to capture the effect of possibly complex environmental interactions. Program environments may come from files, databases, configurations, network sockets, human-user interactions, and more. Conventional approaches for environment capture in symbolic execution and model checking employ environment modeling, which involves manual effort. In this paper, we take a different approach based on an extension of greybox fuzzing. Given a program, we first record all observed environmental interactions at the kernel/user-mode boundary in the form of system calls. Next, we replay the program under the original recorded interactions, but this time with selective mutations applied, in order to get the effect of different program environments -- all without environment modeling. Via repeated (feedback-driven) mutations over a fuzzing campaign, we can search for program environments that induce crashing behaviors. Our EnvFuzz tool found 33 previously unknown bugs in well-known real-world protocol implementations and GUI applications. Many of these are security vulnerabilities and 16 CVEs were assigned.
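The record-then-mutate idea in the abstract can be illustrated with a small sketch. This is not the EnvFuzz implementation (which the summary describes as C++); the `Syscall` record, the `is_input` flag, and the single-bit-flip mutation are all assumptions chosen for illustration. The sketch only shows how a recorded sequence of system calls might be copied with one input payload perturbed before replay.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Syscall:
    """Hypothetical record of one interaction at the kernel/user boundary."""
    name: str       # e.g. "read" or "write"
    fd: int         # file descriptor the call operated on
    data: bytes     # payload observed during recording
    is_input: bool  # True if the call delivers data to the program


def mutate_recording(recording, rng):
    """Return a copy of the recording with one input payload bit-flipped.

    Output calls are left untouched, since only inputs represent the
    environment's influence on the program (an assumption of this sketch).
    """
    candidates = [i for i, sc in enumerate(recording)
                  if sc.is_input and sc.data]
    if not candidates:
        return list(recording)  # nothing to mutate
    idx = rng.choice(candidates)
    payload = bytearray(recording[idx].data)
    pos = rng.randrange(len(payload))
    payload[pos] ^= 1 << rng.randrange(8)  # flip exactly one bit
    mutant = list(recording)
    mutant[idx] = replace(recording[idx], data=bytes(payload))
    return mutant
```

A fuzzing campaign in this style would call `mutate_recording` repeatedly, replay the program against each mutant, and keep mutants that reach new coverage; the replay and feedback steps are omitted here.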

Paper Structure

This paper contains 24 sections, 5 figures, 7 tables, and 2 algorithms.

Figures (5)

  • Figure 1: (a) is a calculator application with the full environment, including regular file I/O, standard streams, and socket/event fds to various system services. (b) is a simplified environment with a single input/output (windowing system socket), where all other interactions are not captured.
  • Figure 2: Overview of Program Environment Fuzzer $\mathcal{E}$fuzz.
  • Figure 3: Illustration of the underlying fuzzing algorithm. Here, the example program reads from file descriptor 0, then interacts with socket (file descriptor 3). The fuzzer faithfully replays a previously recorded interaction ⓪, as well as several mutant interactions ①/②/③/④/⑤/⑥. Each mutant interaction is generated by mutating at least one input system call from the faithful replay. This causes the program's behavior to diverge, including exit with error ②/③, system call reordering ①/⑥, new I/O system call ④, and a crash ⑤. The program state $\{\texttt{INIT},\texttt{READY},\texttt{DISPLAY},\texttt{CLOSING}\}$ between select system calls is also illustrated.
  • Figure 4: Illustration of the global ordering ($\sigma$) for faithful replay and a local ordering ($Q$) for relaxed replay. The relaxed replay partitions $\sigma$ into a set of miniqueues ($Q[\mathit{fd}]$) indexed by the file descriptor, each of which defines a local ordering specific to each $\mathit{fd}$.
  • Figure 5: Code covered over time by AFLNet, Nyx-Net and $\mathcal{E}$fuzz across 10 runs of 24 hours on ProFuzzBench subjects.
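
The partitioning described in the Figure 4 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: the event tuple layout `(fd, name, data)`, the function names, and the choice to return `None` when a miniqueue is exhausted are all hypothetical. It shows only how a global ordering $\sigma$ could be split into per-descriptor miniqueues $Q[\mathit{fd}]$ that each preserve their own local order.

```python
from collections import defaultdict, deque


def partition_into_miniqueues(sigma):
    """Split a global event order into one FIFO queue per file descriptor.

    sigma is a list of (fd, name, data) tuples in recorded order
    (a hypothetical layout for this sketch).
    """
    queues = defaultdict(deque)
    for fd, name, data in sigma:
        queues[fd].append((name, data))
    return queues


def relaxed_next(queues, fd):
    """Serve the next recorded event for fd, ignoring the global position.

    Returns None when nothing is left for that descriptor; how a real
    replayer handles that case is not specified here.
    """
    q = queues.get(fd)
    return q.popleft() if q else None
```

For example, if the recording interleaves reads on descriptor 0 with messages on descriptor 3, a mutated run that touches descriptor 3 first can still be served from `Q[3]` without waiting for the earlier descriptor 0 events, which a strict global-order replay would not allow.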