WGAN-AFL: Seed Generation Augmented Fuzzer with Wasserstein-GAN

Liqun Yang, Chunan Li, Yongxin Qiu, Chaoren Wei, Jian Yang, Hongcheng Guo, Jinxin Ma, Zhoujun Li

TL;DR

WGAN-AFL tackles AFL’s seed-sensitivity by learning the distribution of high-quality seeds from existing testcases using a Wasserstein GAN. The framework adds a Dataset Processing Module to curate seeds, a Model Training stage with fully connected generator/critic networks under weight clipping and RMSProp, and a Fuzzing Module that supplies AFL with diverse, high-quality seeds. Empirical results on Linux Binutils show WGAN-AFL achieves higher code coverage (average ~33.85%, +23.8% vs AFL), more new paths, and significantly more vulnerabilities (e.g., objdump: 274, nm: 21) than AFL and GAN-AFL, validating the seed-quality hypothesis. This approach demonstrates a practical seed-augmentation strategy that enhances fuzzing efficiency and vulnerability discovery in real-world software testing.
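The critic update named above (Wasserstein loss, RMSProp, weight clipping) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: a linear critic stands in for the fully connected network, the toy "real" and "fake" vectors replace curated and generated seeds, and the learning rate, clip bound, and function names (`critic_step`, `demo`) are hypothetical choices made for this example.

```python
import random

def critic_score(w, x):
    """Linear critic f(x) = w . x (stand-in for a fully connected critic)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def mean_vector(batch):
    """Column-wise mean of a batch of equal-length vectors."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]

def critic_step(w, sq_avg, real, fake, lr=5e-5, clip=0.01, alpha=0.99, eps=1e-8):
    """One critic update: RMSProp on the Wasserstein loss, then weight clipping."""
    # Critic loss L = mean f(fake) - mean f(real).
    # For a linear critic, dL/dw = mean(fake) - mean(real).
    grad = [f - r for f, r in zip(mean_vector(fake), mean_vector(real))]
    for i, g in enumerate(grad):
        # RMSProp: running average of squared gradients scales the step.
        sq_avg[i] = alpha * sq_avg[i] + (1 - alpha) * g * g
        w[i] -= lr * g / (sq_avg[i] ** 0.5 + eps)
        # Weight clipping keeps every critic parameter inside [-clip, clip].
        w[i] = max(-clip, min(clip, w[i]))
    return w, sq_avg

def demo(seed=0, dim=8, batch=32, steps=200):
    rng = random.Random(seed)
    # Toy data (assumption): "real" seeds are byte vectors scaled to [0, 1]
    # that cluster high; "fake" samples cluster low.
    real = [[rng.uniform(0.6, 1.0) for _ in range(dim)] for _ in range(batch)]
    fake = [[rng.uniform(0.0, 0.4) for _ in range(dim)] for _ in range(batch)]
    w = [0.0] * dim
    sq_avg = [0.0] * dim
    for _ in range(steps):
        critic_step(w, sq_avg, real, fake)
    # Estimated score gap between real and fake batches under the trained critic.
    gap = critic_score(w, mean_vector(real)) - critic_score(w, mean_vector(fake))
    return w, gap
```

Calling `demo()` returns the clipped critic weights and the score gap between the real and fake batches. In a full WGAN the generator would then be updated to shrink that gap, and its outputs would be decoded back into seed files for AFL.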

Abstract

Addressing security vulnerabilities is indisputably important as software becomes crucial in sectors such as national defense and finance, and the security issues caused by software vulnerabilities cannot be ignored. Fuzz testing is an automated software testing technique that can detect vulnerabilities in software. However, the performance of most existing fuzzers is sensitive to the initial input seeds. Without high-quality initial seeds, a fuzzer may expend significant resources on program path exploration, which substantially reduces the efficiency of vulnerability detection. To address this issue, we propose WGAN-AFL. By collecting high-quality testcases, we train a generative adversarial network (GAN) to learn their features and thereby obtain high-quality initial input seeds. To overcome drawbacks inherent in GANs, such as mode collapse and training instability, we use the Wasserstein GAN (WGAN) architecture for training, which further enhances the quality of the generated seeds. Experimental results show that WGAN-AFL significantly outperforms the original AFL in code coverage, new paths, and vulnerability discovery, demonstrating that WGAN-AFL effectively improves seed quality.

Paper Structure

This paper contains 21 sections, 8 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: AFL workflow.
  • Figure 2: Working process of WGAN (Arjovsky et al., ICML 2017).
  • Figure 3: WGAN-AFL framework.
  • Figure 4: Variation of Loss Values during Model Training: Training Loss of GAN (Left) and Training Loss of WGAN (Right).
  • Figure 5: Training Time of GAN and WGAN.