Harnessing Large Language Models for Seed Generation in Greybox Fuzzing

Wenxuan Shi; Yunhang Zhang; Xinyu Xing; Jun Xu

TL;DR

SeedMind addresses the challenge of generating high-quality seeds for greybox fuzzing, especially for non-standard input formats. It uses LLMs to produce seed-generating programs rather than direct seeds, integrating a coverage-guided, iterative refinement loop and a state-driven realignment mechanism to handle context limits and model reliability. The approach yields seed quality close to human-generated seeds and consistently outperforms prior LLM-based baselines across OSS-Fuzz and MAGMA benchmarks, using multiple models with practical cost. The work demonstrates the practicality and generality of LLM-assisted seed generation for diverse software targets and input modalities, facilitating more effective and scalable fuzzing.
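To make the distinction between a seed and a seed-generating program concrete, here is a minimal sketch of what such a generator might look like. The `TOY1` format (magic bytes, length field, payload, CRC32 trailer) and the names `make_seed` and `generate_seeds` are hypothetical illustrations, not taken from the paper. The point is that binary fields such as lengths and checksums are awkward to emit as raw text but easy to compute in a program.

```python
import struct
import zlib

# Hypothetical toy format used only for illustration (not from the paper):
#   4-byte magic | 2-byte little-endian payload length | payload | 4-byte CRC32
MAGIC = b"TOY1"


def make_seed(payload: bytes) -> bytes:
    """Pack one well-formed seed for the toy format."""
    header = MAGIC + struct.pack("<H", len(payload))
    checksum = struct.pack("<I", zlib.crc32(payload))
    return header + payload + checksum


def generate_seeds() -> list:
    """Return a small corpus covering empty, short, and longer payloads."""
    payloads = [b"", b"A", b"\x00" * 16, bytes(range(256))]
    return [make_seed(p) for p in payloads]


if __name__ == "__main__":
    for i, seed in enumerate(generate_seeds()):
        print(i, len(seed), seed[:8].hex())
```

A generator of this shape can be rerun or edited to produce more variants, which is harder to do with a fixed set of directly emitted seed files.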

Abstract

Greybox fuzzing has emerged as a preferred technique for discovering software bugs, striking a balance between efficiency and depth of exploration. While research has focused on improving fuzzing techniques, the importance of high-quality initial seeds remains critical yet often overlooked. Existing methods for seed generation are limited, especially for programs with non-standard or custom input formats. Large Language Models (LLMs) have revolutionized numerous domains, showcasing unprecedented capabilities in understanding and generating complex patterns across various fields of knowledge. This paper introduces SeedMind, a novel system that leverages LLMs to boost greybox fuzzing through intelligent seed generation. Unlike previous approaches, SeedMind employs LLMs to create test case generators rather than directly producing test cases. Our approach implements an iterative, feedback-driven process that guides the LLM to progressively refine test case generation, aiming for increased code coverage depth and breadth. In developing SeedMind, we addressed key challenges including input format limitations, context window constraints, and ensuring consistent, progress-aware behavior. Intensive evaluations with real-world applications show that SeedMind effectively harnesses LLMs to generate high-quality test cases and facilitate bug finding during fuzzing, presenting utility comparable to human-created seeds and significantly outperforming the existing LLM-based solutions.
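The iterative, feedback-driven process described above can be sketched as a small loop. This is a toy illustration under stated assumptions, not SeedMind's implementation: `target_coverage` stands in for an instrumented target, `propose_generator` is a stub where an LLM call would go, and the branch labels and `HD` header are invented for the example.

```python
from typing import Callable, List, Set, Tuple

Generator = Callable[[], List[bytes]]

# Hypothetical branch labels for a toy target (assumption, not from the paper).
ALL_BRANCHES = {"entry", "magic_ok", "has_body", "version_1"}


def target_coverage(data: bytes) -> Set[str]:
    """Toy stand-in for an instrumented target: returns the branches hit."""
    hit = {"entry"}
    if data[:2] == b"HD":
        hit.add("magic_ok")
        if len(data) > 4:
            hit.add("has_body")
            if data[2] == 0x01:
                hit.add("version_1")
    return hit


def propose_generator(uncovered: Set[str]) -> Generator:
    """Stub for the LLM step: returns a canned generator aimed at an
    uncovered branch. A real system would prompt a model with feedback."""
    if "magic_ok" in uncovered:
        return lambda: [b"HD"]
    if "has_body" in uncovered:
        return lambda: [b"HD\x00\x00body"]
    return lambda: [b"HD\x01\x00body"]


def refine(max_rounds: int = 5) -> Tuple[List[bytes], Set[str]]:
    """Repeat: request a generator, run it, keep seeds that add coverage."""
    covered: Set[str] = set()
    corpus: List[bytes] = []
    for _ in range(max_rounds):
        uncovered = ALL_BRANCHES - covered
        if not uncovered:
            break  # nothing left to aim for
        generator = propose_generator(uncovered)
        for seed in generator():
            new = target_coverage(seed) - covered
            if new:  # keep only seeds that reach something new
                corpus.append(seed)
                covered |= new
    return corpus, covered


if __name__ == "__main__":
    corpus, covered = refine()
    print(len(corpus), sorted(covered))
```

The loop keeps only seeds that reach previously uncovered branches and passes the remaining uncovered set back to the proposer, which is the role coverage feedback plays in the process the abstract describes.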
