Locus: Agentic Predicate Synthesis for Directed Fuzzing

Jie Zhu, Chihao Shen, Ziyang Li, Jiahao Yu, Yizheng Chen, Kexin Pei

TL;DR

The paper tackles the inefficiency of directed fuzzing, which stems from sparse, hard-to-reach target states and from manually designed constraints that do not generalize. It introduces Locus, an agentic framework that synthesizes semantically meaningful progress predicates and instruments them into programs offline, validating each predicate to ensure that it relaxes the target state and therefore never rejects an input that would reach it. In an evaluation on the Magma benchmark across eight fuzzers, Locus achieves an average speedup of 41.6x and uncovers nine previously unpatched vulnerabilities, three of which have been acknowledged with draft patches. The work demonstrates the potential of combining LLM-driven reasoning with symbolic-execution-based validation to generalize directed fuzzing across diverse targets and languages, enabling more reliable and scalable vulnerability discovery.

Abstract

Directed fuzzing aims to find program inputs that lead to specified target program states. It has broad applications, such as debugging system crashes, confirming reported bugs, and generating exploits for potential vulnerabilities. This task is inherently challenging because target states are often deeply nested in the program, while the search space of possible program inputs is prohibitively large. Existing approaches rely on branch distances or manually specified constraints to guide the search; however, branches alone are often insufficient to precisely characterize progress toward the target states, while manually specified constraints are often tailored to specific bug types and thus difficult to generalize to diverse target states and programs. We present Locus, a novel framework to improve the efficiency of directed fuzzing. Our key insight is to synthesize predicates that capture fuzzing progress as semantically meaningful intermediate states, serving as milestones toward reaching the target states. When used to instrument the program under fuzzing, these predicates can reject executions unlikely to reach the target states while providing additional coverage guidance. To automate this task and generalize to diverse programs, Locus features an agentic framework with program analysis tools to synthesize and iteratively refine the candidate predicates, while using symbolic execution to ensure that the predicates strictly relax the target states, which prevents false rejections. Our evaluation shows that Locus substantially improves the efficiency of eight state-of-the-art fuzzers in discovering real-world vulnerabilities, achieving an average speedup of 41.6x. So far, Locus has found nine previously unpatched bugs, three of which have already been acknowledged with draft patches.
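
To make the idea of predicates as milestones concrete, the following is a minimal Python sketch of an instrumented toy program. It is an illustration under stated assumptions, not code from Locus: the parser, the `PN` magic bytes, the `Rejected` exception, and the coverage labels are all hypothetical, and the predicates Locus synthesizes are inserted into the real program under fuzzing rather than into a Python function. The inserted guard terminates executions that cannot reach the target, and its passing branch gives the fuzzer one more piece of coverage feedback.

```python
class Rejected(Exception):
    """Raised when a progress predicate fails and the run is terminated early."""


def parse_instrumented(data: bytes, coverage: set) -> str:
    """Toy target program with one hypothetical progress predicate inserted."""
    # --- inserted predicate (hypothetical) ---
    # Every input that reaches the target starts with b"PN", so inputs
    # failing this check can be rejected without running the rest.
    if not data.startswith(b"PN"):
        raise Rejected("magic bytes missing")
    coverage.add("phi_magic")  # extra branch feedback for the fuzzer

    # --- original program logic, unchanged ---
    if data.startswith(b"PN") and len(data) >= 3 and data[2] > 200:
        coverage.add("target")
        return "target reached"
    return "benign"


def run(data: bytes):
    """Execute one input and report the outcome plus the coverage it produced."""
    coverage = set()
    try:
        outcome = parse_instrumented(data, coverage)
    except Rejected:
        outcome = "rejected"
    return outcome, coverage


if __name__ == "__main__":
    for sample in (b"XX\xff", b"PN\x01", b"PN\xff"):
        print(sample, run(sample))
```

In this sketch the guard is weaker than the target condition, so no target-reaching input is ever rejected; that property is what the validation step described in the abstract is meant to guarantee.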

Paper Structure

This paper contains 19 sections, 2 theorems, 4 figures, 7 tables, 1 algorithm.

Key Result

Theorem 1

The instrumented program $P'$ is fuzzing admissible to $P$ if $P'$ is instrumented with a set of predicates $\Phi$, where every $\phi \in \Phi$ is a relaxation of the canary $\psi$.
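
The relaxation condition in the theorem can be illustrated with a small check. The sketch below assumes, following the abstract's wording about preventing false rejections, that $\phi$ relaxes $\psi$ when every state satisfying $\psi$ also satisfies $\phi$; the paper's formal definitions are not reproduced on this page, so this reading is an assumption. The brute-force enumeration over a tiny domain is a stand-in for the symbolic execution that Locus uses, and all function names and toy conditions are hypothetical.

```python
from itertools import product


def psi(x: int, y: int) -> bool:
    """Toy canary: stands in for the target state."""
    return x > 200 and y == x - 200


def phi_good(x: int, y: int) -> bool:
    """Candidate predicate that is weaker than psi."""
    return x > 100


def phi_bad(x: int, y: int) -> bool:
    """Candidate predicate that wrongly excludes some target states."""
    return y < 50


def is_relaxation(phi, psi, domain) -> bool:
    """True if psi(s) implies phi(s) for every state s in the finite domain."""
    return all(phi(*s) for s in domain if psi(*s))


def counterexample(phi, psi, domain):
    """Return one state that satisfies psi but not phi, or None if none exists."""
    return next((s for s in domain if psi(*s) and not phi(*s)), None)


# All pairs of byte values: small enough to enumerate exhaustively.
DOMAIN = list(product(range(256), repeat=2))

if __name__ == "__main__":
    print(is_relaxation(phi_good, psi, DOMAIN))   # True
    print(is_relaxation(phi_bad, psi, DOMAIN))    # False
    print(counterexample(phi_bad, psi, DOMAIN))   # (250, 50)
```

A counterexample such as the one returned for `phi_bad` is the kind of feedback that could drive another refinement round: the candidate would reject an input that reaches the target, so it must be weakened or discarded.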

Figures (4)

  • Figure 1: A motivating example (CVE-2013-6954) showing how Locus complements existing work. (a) Traditional approaches based on distance to the target in the CFG lack fine-grained guidance: they cannot distinguish nodes that have the same distance. (b) LLM-based harness generation offers only limited help in reaching the target. (c) Predicates (as if statements) synthesized by Locus provide extra semantic guidance for directed greybox fuzzers (DGFs), while relaxing constraint generation from the input level to arbitrary program points.
  • Figure 2: Overview of Locus workflow. Locus takes as inputs the program codebase $P$ and the canary $\psi$, and produces a program $P'$ instrumented with the progress-capturing predicates. The predicate branches provide extra coverage feedback and guards (via early termination) to guide the fuzzer toward reaching the target state, i.e., canary $\psi$, more efficiently.
  • Figure 3: Locus generates a more precise canary using only the security patch. Previously, an incorrectly set dequoting flag enabled access to uninitialized memory.
  • Figure 4: A previously unknown vulnerability in libarchive
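
The limitation noted in Figure 1(a) can be seen on a toy control-flow graph. The sketch below uses a simplified hop-count distance computed by breadth-first search over reversed edges; real directed fuzzers use more elaborate distance metrics, and the graph and node names here are hypothetical. Two sibling branches end up with the same distance to the target, so a distance-only signal cannot tell the fuzzer which one represents more progress.

```python
from collections import deque


def distances_to_target(cfg: dict, target: str) -> dict:
    """Shortest number of edges from each node to the target (BFS on reversed edges)."""
    # Build the reversed graph so the search can start from the target.
    reverse = {}
    for src, dsts in cfg.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)

    dist = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for pred in reverse.get(node, []):
            if pred not in dist:
                dist[pred] = dist[node] + 1
                queue.append(pred)
    return dist  # nodes that cannot reach the target are absent


# Hypothetical CFG: two sibling branches both lead to a check before the target.
CFG = {
    "entry": ["a", "b"],
    "a": ["check"],
    "b": ["check"],
    "check": ["target", "exit"],
    "target": [],
    "exit": [],
}

if __name__ == "__main__":
    print(distances_to_target(CFG, "target"))
```

Here `a` and `b` both sit two edges from the target. A predicate evaluated inside one of these branches can add the semantic distinction that the distance metric lacks.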

Theorems & Definitions (5)

  • Definition 1
  • Definition 2
  • Definition 3
  • Theorem 1
  • Theorem 2