BugLens: Leveraging Bisection for Lightweight Compiler Bug Deduplication

Xintong Zhou, Zhenyang Xu, Yongqiang Tian, Chengnian Sun

TL;DR

This paper tackles the bug deduplication problem in compiler testing by evaluating whether bisection, a simple debugging technique, can effectively identify unique miscompilation bugs exposed by random testing. It introduces BugLens, a bisection-based deduplication approach that uses failure-inducing commits as the primary criterion and augments this with bug-triggering optimizations to mitigate false negatives. Empirical results across four real-world GCC/LLVM datasets show BugLens significantly reduces human effort compared with state-of-the-art analysis-based methods (Tamer and D3), while maintaining strong generality and practical efficiency. The work demonstrates that a lightweight, generalizable strategy can outperform more complex, tool-heavy approaches, highlighting the value of re-evaluating simple techniques for real-world compiler debugging tasks.

Abstract

Random testing has proven to be an effective technique for compiler validation. However, debugging the bugs it finds is challenging because many test programs are duplicates that expose the same compiler bug. Identifying these duplicates is a practical research problem known as bug deduplication. Prior approaches to compiler bug deduplication rely primarily on program analysis to extract bug-related features, which can incur substantial computational overhead and limit generalizability. This paper investigates whether bisection, a standard debugging procedure largely overlooked in prior research on compiler bug deduplication, can serve this purpose. Our study shows that using bisection to locate failure-inducing commits provides a valuable deduplication criterion, albeit one that requires supplementary techniques for more accurate identification. Building on these results, we introduce BugLens, a novel deduplication method that primarily uses bisection, enhanced by the identification of bug-triggering optimizations to minimize false negatives. Empirical evaluations on four real-world datasets demonstrate that BugLens significantly outperforms the state-of-the-art analysis-based methods Tamer and D3, saving on average 26.98% and 9.64% of human effort, respectively, to identify the same number of distinct bugs. Given its inherent simplicity and generalizability, bisection offers a highly practical solution for compiler bug deduplication in real-world applications.
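
To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear commit history ordered oldest to newest, a test that passes at the first commit and fails at the last, and a failure that persists once introduced. The names first_failing_commit, group_tests, commit_of, and opt_of are hypothetical, and combining the failure-inducing commit with the bug-triggering optimization into a single grouping key is an assumption made for illustration; the abstract does not specify how the two criteria are combined.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple


def first_failing_commit(commits: Sequence[str],
                         fails: Callable[[str], bool]) -> str:
    """Bisect for the earliest commit at which the test program fails.

    Assumptions (hypothetical setup): commits are ordered oldest to newest,
    commits[0] passes, commits[-1] fails, and the failure is monotonic.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] passes, commits[hi] fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(commits[mid]):
            hi = mid  # failure already present at mid: look earlier
        else:
            lo = mid  # still passing at mid: look later
    return commits[hi]


def group_tests(tests: Sequence[str],
                commit_of: Callable[[str], str],
                opt_of: Callable[[str], str]) -> Dict[Tuple[str, str], List[str]]:
    """Group test programs by (failure-inducing commit, triggering optimization).

    Tests that share a key are treated as likely duplicates. Using the pair as
    the key is an illustrative assumption, not the paper's stated rule.
    """
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for test in tests:
        groups[(commit_of(test), opt_of(test))].append(test)
    return dict(groups)


# Toy usage with made-up commit names and optimization labels.
commits = ["c0", "c1", "c2", "c3", "c4", "c5"]
culprit = first_failing_commit(commits, lambda c: int(c[1:]) >= 3)

toy_opts = {"t1": "opt-A", "t2": "opt-A", "t3": "opt-B"}
groups = group_tests(["t1", "t2", "t3"], lambda t: culprit, toy_opts.get)
```

In this toy run all three test programs bisect to the same commit, so the commit alone would merge them into one group; the second key component separates t3 from t1 and t2, which is the kind of false negative the abstract says the optimization criterion is meant to reduce.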

Paper Structure

This paper contains 23 sections, 3 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Bug discovery curves of Bisection-Sole, Tamer, D3 and the baseline on the GCC-4.3.0 dataset.
  • Figure 2: Bug discovery curves of BugLens, Tamer, D3 and the baseline on the GCC-4.3.0 dataset.
  • Figure 3: Bug discovery curves of Bisection-Sole, BugLens, and the baseline on the unminimized GCC-4.3.0 dataset.
  • Figure 4: Number of times each compiler version is examined by different test programs during bisection in each dataset.