ReFuzz: Reusing Tests for Processor Fuzzing with Contextual Bandits

Chen Chen, Zaiyan Xu, Mohamadreza Rostami, David Liu, Dileep Kalathil, Ahmad-Reza Sadeghi, Jeyavijayan Rajendran

TL;DR

ReFuzz is a contextual-bandit-driven framework that reuses effective tests from prior processors (PP-tests) to seed fuzzing on a processor-under-test (PUT). By adaptively selecting and mutating PP-tests rather than fuzzing from scratch, it reaches coverage faster and achieves higher total coverage, while also detecting vulnerabilities and bug variants that propagate through design reuse. The approach includes an adaptive test minimization strategy that prunes redundant tests, and it integrates with existing fuzzers. It is evaluated on open-source RISC-V designs, where it discovers three new security vulnerabilities and two new functional bugs, with an average 511.23x coverage speedup and up to 9.33% more total coverage than existing fuzzers. The work offers a practical, industry-compatible path to more efficient pre-silicon verification and vulnerability discovery across generations of processor designs.

Abstract

Processor design relies on iterative modification and the reuse of well-established designs. However, this reuse also leads to similar vulnerabilities across multiple processors. As processors grow increasingly complex through iterative modification, efficiently detecting vulnerabilities in modern processors is critical. Inspired by software fuzzing, hardware fuzzing has recently demonstrated its effectiveness in detecting processor vulnerabilities. Yet, to the best of our knowledge, existing processor fuzzers fuzz each design individually and cannot use known vulnerabilities in prior processors to steer fuzzing toward similar or new variants of those vulnerabilities. To address this gap, we present ReFuzz, an adaptive fuzzing framework that leverages contextual bandits to reuse highly effective tests from prior processors to fuzz a processor-under-test (PUT) within a given ISA. By intelligently mutating tests that trigger vulnerabilities in prior processors, ReFuzz effectively detects similar and new variants of vulnerabilities in PUTs. ReFuzz uncovered three new security vulnerabilities and two new functional bugs. It detected one vulnerability by reusing a test that triggers a known vulnerability in a prior processor. One functional bug exists across three processors that share design modules. The second functional bug has two variants. Additionally, ReFuzz reuses highly effective tests to improve coverage efficiency, achieving an average 511.23x coverage speedup and up to 9.33% more total coverage compared to existing fuzzers.
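
To make the idea of bandit-guided test reuse concrete, the following is a minimal sketch of a tabular epsilon-greedy contextual bandit that picks which prior-processor test to reuse given a coverage context. It is not the algorithm from the paper: the epsilon-greedy policy, the context names (`frontend`, `csr`), the test names (`pp_test_*`), and the simulated reward function are all assumptions made for illustration, standing in for real coverage feedback from a PUT.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Tabular contextual bandit: one running-mean reward per (context, arm)."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, arm) -> number of pulls
        self.values = defaultdict(float)  # (context, arm) -> mean reward

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        key = (context, arm)
        self.counts[key] += 1
        # Incremental update of the running mean.
        self.values[key] += (reward - self.values[key]) / self.counts[key]


def simulated_new_coverage(context, test, rng):
    """Hypothetical stand-in for running a mutated test on the PUT."""
    # Assumption for the demo: each context has exactly one high-payoff test.
    best = {"frontend": "pp_test_branch", "csr": "pp_test_csr"}
    base = 5.0 if best[context] == test else 1.0
    return base + rng.random()


def run_demo(rounds=2000, seed=0):
    rng = random.Random(seed)
    tests = ["pp_test_branch", "pp_test_csr", "pp_test_mul"]
    bandit = EpsilonGreedyContextualBandit(tests, epsilon=0.1, seed=seed)
    for _ in range(rounds):
        context = rng.choice(["frontend", "csr"])  # observed coverage context
        test = bandit.select(context)              # prior test to reuse
        reward = simulated_new_coverage(context, test, rng)
        bandit.update(context, test, reward)
    return bandit


if __name__ == "__main__":
    bandit = run_demo()
    for ctx in ("frontend", "csr"):
        best = max(bandit.arms, key=lambda arm: bandit.values[(ctx, arm)])
        print(ctx, "->", best)
```

In this toy setting the bandit learns a different preferred test for each context, which is the property that distinguishes a contextual bandit from a plain multi-armed bandit. A real fuzzer would replace `simulated_new_coverage` with coverage measured from simulation of the PUT.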

Paper Structure

This paper contains 34 sections, 1 equation, 9 figures, 3 tables, 3 algorithms.

Figures (9)

  • Figure 1: ReFuzz, a novel fuzzing framework that leverages effective tests from prior processors to enhance fuzzing efficiency on processors-under-test (PUTs). "Vuls." stands for vulnerabilities.
  • Figure 2: BOOMV3 and BOOMV4 have the same bug due to reusing modules that update the minstret register two cycles after an instruction is committed. BOOMV4 has more variants of the bug due to its new microarchitectures. For example, the MUL instruction will trigger the bug in BOOMV4 but not in BOOMV3. Red lines highlight the clock cycles when minstret is accessed to represent architectural states, while dashed arrows point to the actual commit number for each instruction.
  • Figure 3: ReFuzz's training stage. $C$ is the set of different coverage contexts. $Vul_t$ is a test in the vulnerability list, and $Cov_t$ is a test in the coverage list.
  • Figure 4: The total coverage achieved by the baseline fuzzer, original CB, and adaptive CB on BOOMV4.
  • Figure 5: The framework of ReFuzz.
  • ...and 4 more figures

Theorems & Definitions (3)

  • Definition 1
  • Definition 2
  • Definition 3