Validity-Preserving Delta Debugging via Generator Trace Reduction

Luyao Ren, Xing Zhang, Ziyue Hua, Yanyan Jiang, Xiao He, Yingfei Xiong, Tao Xie

TL;DR

This work tackles the validity problem in delta debugging when test inputs must adhere to rich specifications. It introduces GReduce, a generator-based delta debugging framework that reduces the execution trace of a test input generator to produce smaller, valid inputs while preserving bug manifestation. The approach combines trace instrumentation, trace-aware reduction with loop and selection patterns, and trace-aligned re-execution to synthesize reduced inputs efficiently. Empirical evaluation on five benchmarks (graphs, deep learning models, JavaScript programs, SymPy, and algebraic data types) demonstrates that GReduce substantially outperforms state-of-the-art syntax-based reducers and other baselines in both effectiveness and efficiency, with modest instrumentation overhead. The results indicate broad applicability, robustness, and practical impact for domain-specific test input reduction when inputs are governed by complex specifications.

Abstract

Reducing test inputs that trigger bugs is crucial for efficient debugging. Delta debugging is the most popular approach for this purpose. When test inputs need to conform to certain specifications, existing delta debugging practice encounters a validity problem: it blindly applies reduction rules, producing a large number of invalid test inputs that do not satisfy the required specifications. The resulting loss of effectiveness and efficiency becomes even more pronounced when the specifications extend beyond syntactical structures. Our key insight is that we should leverage input generators, which are aware of these specifications, to generate valid reduced inputs, rather than straightforwardly performing reduction on test inputs. In this paper, we propose a generator-based delta debugging method, namely GReduce, which derives validity-preserving reducers. Specifically, given a generator and an execution of it that shows how the bug-inducing test input was generated, GReduce searches for other executions of the generator that yield reduced, valid test inputs. The evaluation results on five benchmarks (i.e., graphs, DL models, JavaScript programs, SymPy, and algebraic data types) show that GReduce substantially outperforms state-of-the-art syntax-based reducers including Perses and T-PDD, and also outperforms QuickCheck, SmartCheck, as well as the state-of-the-art choice-sequence-based reducer Hypothesis, demonstrating the effectiveness, efficiency, and versatility of GReduce.
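
The idea of reducing a generator's execution rather than its output can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not GReduce's actual API or algorithm: the toy tree generator `gen_tree`, the hypothetical failure condition `has_bug`, and the helpers `replay`, `is_valid_tree`, and `reduce_trace` are invented names. Clamping an out-of-range recorded choice is a simplified stand-in for trace alignment, and the greedy deletion of one loop iteration at a time is a simplified stand-in for the reduction search.

```python
from typing import Callable, List, Tuple

Edge = Tuple[int, int]
Trace = List[int]  # one recorded parent choice per loop iteration


def gen_tree(choose: Callable[[int], int], iterations: int) -> List[Edge]:
    """Toy generator: node i attaches to an existing node 0..i-1.

    Every output is a tree by construction, which is the validity
    constraint a syntax-only reducer could easily break.
    """
    edges: List[Edge] = []
    for i in range(1, iterations + 1):
        parent = choose(i)  # a choice among the i existing nodes
        edges.append((parent, i))
    return edges


def replay(trace: Trace) -> List[Edge]:
    """Re-execute the generator, steering each choice from the trace."""
    recorded = iter(trace)

    def choose(n: int) -> int:
        # Simplified alignment: clamp a recorded choice that is no
        # longer feasible after earlier iterations were deleted.
        return min(next(recorded), n - 1)

    return gen_tree(choose, len(trace))


def is_valid_tree(edges: List[Edge]) -> bool:
    """The k-th edge must connect new node k+1 to an earlier node."""
    return all(c == k + 1 and 0 <= p < c for k, (p, c) in enumerate(edges))


def has_bug(edges: List[Edge]) -> bool:
    """Hypothetical failure: some node has degree 3 or more."""
    degree = {}
    for p, c in edges:
        degree[p] = degree.get(p, 0) + 1
        degree[c] = degree.get(c, 0) + 1
    return any(d >= 3 for d in degree.values())


def reduce_trace(trace: Trace, fails: Callable[[List[Edge]], bool]) -> Trace:
    """Greedily delete single loop iterations while the bug persists."""
    changed = True
    while changed:
        changed = False
        for i in range(len(trace)):
            candidate = trace[:i] + trace[i + 1:]
            if fails(replay(candidate)):
                trace = candidate
                changed = True
                break
    return trace


original: Trace = [0, 0, 1, 0, 3, 2]
reduced = reduce_trace(original, has_bug)
print(reduced, replay(reduced))
```

Because every candidate is produced by re-running the generator, each one is a valid tree, so no test executions are spent on inputs that violate the specification. The greedy loop only guarantees that no single further deletion keeps the bug; it does not guarantee a globally smallest input.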

Paper Structure (43 sections, 27 equations, 8 figures, 4 tables, 2 algorithms)

This paper contains 43 sections, 27 equations, 8 figures, 4 tables, 2 algorithms.

Figures (8)

  • Figure 1: Reduction steps and derived test inputs.
  • Figure 2: The overall workflow demonstrating how GReduce finds a reduced generated input by aligning the trace of re-execution $T^{\prime}$ with the original given trace $T$.
  • Figure 3: A generator and its trace structure under tree-based trace reduction.
  • Figure 4: An example of infeasible trace alignment and how different strategies work on it.
  • Figure 5: Results of the reduction quality ($Size \slash Size_{o}$) and reduction time ($Time$) on graphs.
  • ...and 3 more figures

Theorems & Definitions (5)

  • Definition 1: Generator
  • Definition 2: State and Operation
  • Definition 3: Execution
  • Definition 4: Trace
  • Definition 5: Delta Debugging