Fuzzing MLIR Compilers with Custom Mutation Synthesis

Ben Limpanukorn, Jiyuan Wang, Hong Jin Kang, Eric Zitong Zhou, Miryung Kim

TL;DR

SynthFuzz is a new test generator that combines grammar-based fuzzing with custom mutation synthesis, obviating the need to manually define custom mutation operators for each MLIR dialect.

Abstract

Compiler technologies in deep learning and domain-specific hardware acceleration are increasingly adopting extensible compiler frameworks such as Multi-Level Intermediate Representation (MLIR) to facilitate more efficient development. With MLIR, compiler developers can easily define their own custom IRs in the form of MLIR dialects. However, the diversity and rapid evolution of such custom IRs make it impractical to manually write a custom test generator for each dialect. To address this problem, we design a new test generator called SynthFuzz that combines grammar-based fuzzing with custom mutation synthesis. The essence of SynthFuzz is twofold: (1) It automatically infers parameterized, context-dependent custom mutations from existing test cases. (2) It then concretizes the mutation's content depending on the target context and reduces the chance of inserting invalid edits by performing k-ancestor and pre(post)fix matching. SynthFuzz obviates the need to manually define custom mutation operators for each dialect. We compare SynthFuzz to three baselines: Grammarinator, MLIRSmith, and NeuRI. We conduct this comprehensive comparison on four different MLIR projects. Each project defines a new set of MLIR dialects where manually writing a custom test generator would take weeks of effort. Our evaluation shows that SynthFuzz on average improves MLIR dialect pair coverage by 1.75 times, which increases branch coverage by 1.22 times. Further, we show that our context-dependent custom mutation increases the proportion of valid tests by up to 1.11 times, indicating that SynthFuzz correctly concretizes its parameterized mutations with respect to the target context. Parameterization of the mutations reduces the fraction of tests violating the base MLIR constraints by 0.57 times, increasing the time spent fuzzing dialect-specific code.
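
The two steps named in the abstract, inferring a parameterized mutation and then concretizing it for a target context, can be illustrated with a small string-level sketch. This is not SynthFuzz's implementation, which works on parse trees; the token pattern, the `$P0`-style placeholder syntax, and the helper names `parameterize`, `bind`, and `concretize` are all assumptions made for illustration, and the MLIR-like snippets are invented examples rather than listings from the paper.

```python
import re

# Assumption: only SSA values (%arg0, %0) and integer types (i2, i4) are parameterized.
SYMBOL = re.compile(r"%[A-Za-z0-9_]+|\bi[0-9]+\b")
PLACEHOLDER = re.compile(r"\$P\d+")


def parameterize(context: str, mutation: str):
    """Replace every symbol of the donor context with a placeholder ($P0, $P1, ...).

    Symbols in the mutation are replaced only if they also occur in the context;
    names that the mutation itself introduces stay concrete in this sketch.
    """
    names = {}

    def fresh(match):
        sym = match.group(0)
        if sym not in names:
            names[sym] = f"$P{len(names)}"
        return names[sym]

    param_context = SYMBOL.sub(fresh, context)
    param_mutation = SYMBOL.sub(lambda m: names.get(m.group(0), m.group(0)), mutation)
    return param_context, param_mutation


def bind(param_context: str, recipient_context: str):
    """Match a parameterized context against a recipient context.

    Returns a placeholder -> concrete symbol mapping, or None if the literal
    text around the placeholders does not match.
    """
    parts, seen, pos = [], set(), 0
    for m in PLACEHOLDER.finditer(param_context):
        parts.append(re.escape(param_context[pos:m.start()]))
        name = m.group(0)[1:]  # "$P0" -> "P0"
        if name in seen:
            parts.append(f"(?P={name})")  # repeated placeholder must bind the same symbol
        else:
            seen.add(name)
            parts.append(f"(?P<{name}>{SYMBOL.pattern})")
        pos = m.end()
    parts.append(re.escape(param_context[pos:]))
    match = re.fullmatch("".join(parts), recipient_context)
    if match is None:
        return None
    return {f"${name}": value for name, value in match.groupdict().items()}


def concretize(param_mutation: str, bindings: dict) -> str:
    """Fill the placeholders of a parameterized mutation with bound symbols."""
    return PLACEHOLDER.sub(lambda m: bindings[m.group(0)], param_mutation)


# Invented donor: one context line and the line to be transplanted.
param_context, param_mutation = parameterize(
    "%c1 = arith.constant 1 : i2",
    "%o1 = arith.addi %c1, %c1 : i2",
)
# Invented recipient context with different value names and a different type.
bindings = bind(param_context, "%0 = arith.constant 1 : i4")
print(concretize(param_mutation, bindings))  # %o1 = arith.addi %0, %0 : i4
```

Because the operand and type names come from the recipient rather than the donor, the inserted line refers to values that exist at the insertion point. The sketch does not rename the newly defined `%o1`, so a real tool would also need to avoid clashes with names already used by the recipient.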

Paper Structure

This paper contains 28 sections, 5 figures, 4 tables, 1 algorithm.

Figures (5)

  • Figure 1: A flowchart of SynthFuzz's fuzzing loop
  • Figure 2: This diagram illustrates how $\mathrm{SYNTH}$ decomposes the donor test case $P_d$ shown in Listing \ref{lst:synth-example} into a parameterized context and a parameterized mutation. For example, concrete symbols such as %arg0, %c1, i2, and %o1 are replaced by the placeholders A, B, C, and D, respectively.
  • Figure 3: Illustration of $k$-ancestor and $l(r)$-sibling context matching. Location B is invalid because it does not match the postfix context with $r=1$. Location C is invalid because, with $k=2$, it does not match the $k$-ancestor path context: its parent node is an operation, not a block.
  • Figure 4: An illustration of the $\mathrm{MATCH}$ step. When the recipient test case in Listing \ref{lst:match-example} is matched with the parameterized context shown in this figure, the parameters A, B, C, and D are bound to the concrete values %arg0, %0, %i4, and %1, respectively.
  • Figure 5: Branch coverage for each subject program. SynthFuzz outperforms a grammar-based fuzzer by up to 1.51$\times$ and improves coverage by up to 1.47$\times$ compared to existing seed tests.
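
The $k$-ancestor and $l(r)$-sibling matching described in the Figure 3 caption can be sketched on a toy tree. This is a minimal illustration under stated assumptions, not the paper's algorithm: the `Node` class, the node kinds, and the helpers `context_at` and `matching_locations` are hypothetical, and the tree is a simplification of MLIR's operation/region/block nesting. A candidate insertion slot is accepted only if its $k$ nearest ancestors, its $l$ preceding siblings, and its $r$ following siblings all have the expected kinds.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    kind: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


Context = Tuple[Tuple[str, ...], Tuple[str, ...], Tuple[str, ...]]


def context_at(parent: Node, index: int, k: int, l: int, r: int) -> Context:
    """Context of the insertion slot just before parent.children[index]."""
    ancestors = []
    node: Optional[Node] = parent
    while node is not None and len(ancestors) < k:
        ancestors.append(node.kind)  # nearest ancestor first
        node = node.parent
    kinds = [child.kind for child in parent.children]
    prefix = tuple(kinds[max(0, index - l):index])  # up to l left siblings
    postfix = tuple(kinds[index:index + r])         # up to r right siblings
    return tuple(ancestors), prefix, postfix


def matching_locations(root: Node, wanted: Context, k: int, l: int, r: int):
    """Return every (parent, index) slot whose context equals `wanted`."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        for i in range(len(node.children) + 1):
            if context_at(node, i, k, l, r) == wanted:
                found.append((node, i))
        stack.extend(node.children)
    return found


# Toy recipient: func -> block -> [constant, if -> [constant, return], constant, return]
root = Node("func")
block = root.add(Node("block"))
block.add(Node("constant"))
if_op = block.add(Node("if"))
if_op.add(Node("constant"))
if_op.add(Node("return"))
block.add(Node("constant"))
block.add(Node("return"))

# Expected context: parent is a block inside a func, one constant before, a return after.
wanted = (("block", "func"), ("constant",), ("return",))
locations = matching_locations(root, wanted, k=2, l=1, r=1)
print([(parent.kind, index) for parent, index in locations])  # [('block', 3)]
```

In this toy tree, the slot after the first `constant` in the block is rejected because its right sibling is `if` rather than `return`, and the slot inside the `if` node is rejected because its parent is an operation rather than a block. These are the same two failure modes the caption attributes to locations B and C.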