InsightQL: Advancing Human-Assisted Fuzzing with a Unified Code Database and Parameterized Query Interface

Wentao Gao, Renata Borovica-Gajic, Sang Kil Cha, Tian Qiu, Van-Thuan Pham

TL;DR

The paper addresses coverage plateaus in coverage-guided fuzzing by introducing InsightQL, a human-assisted framework that unifies static code data with dynamic fuzzing data in a Kimball-style data warehouse built atop CodeQL. It provides a parameterized query interface and VS Code integration to help analysts identify and resolve fuzz blockers efficiently, demonstrated on 14 real-world libraries with notable coverage improvements (up to 13.90%). The contributions include a unified code database design, a set of extensible queries for taint and blocker analysis, and a Python-based automation framework, enabling scalable, user-friendly analysis of fuzz campaigns. The work demonstrates the practical viability of human-in-the-loop fuzzing tooling, showing meaningful improvements in code coverage and blocker resolution, and outlines a path for expanding blocker taxonomy and automation in future research.

Abstract

Fuzzing is a highly effective automated testing method for uncovering software vulnerabilities. Despite advances in fuzzing techniques, such as coverage-guided greybox fuzzing, many fuzzers struggle with coverage plateaus caused by fuzz blockers, limiting their ability to find deeper vulnerabilities. Human expertise can address these challenges, but analyzing fuzzing results to guide this support remains labor-intensive. To tackle this, we introduce InsightQL, the first human-assisting framework for fuzz blocker analysis. Powered by a unified database and an intuitive parameterized query interface, InsightQL aids developers in systematically extracting insights and efficiently unblocking fuzz blockers. Our experiments on 14 popular real-world libraries from the FuzzBench benchmark demonstrate the effectiveness of InsightQL, leading to the unblocking of many fuzz blockers and considerable improvements in code coverage (up to 13.90%).

Paper Structure

This paper contains 25 sections, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: A 3-phase workflow of using code coverage-guided greybox fuzzing for vulnerability discovery
  • Figure 2: An in-progress classification of fuzz blockers (gao2023beyond)
  • Figure 3: Multi-layer design of InsightQL. InsightQL is built on top of CodeQL, following Kimball's Data Warehouse model. New components are highlighted in orange.
  • Figure 4: Comparison of time spent on each task between InsightQL users and InsightQL developers