DiffTester: Accelerating Unit Test Generation for Diffusion LLMs via Repetitive Pattern

Lekang Yang, Yuetong Liu, Yitong Zhang, Jia Li

TL;DR

DiffTester tackles inefficiency in diffusion LLMs for unit test generation by mining repetitive AST-level patterns across multiple unit tests for the same focal method to guide token retention. It prompts dLLMs to generate several tests in parallel, then merges ASTs to identify shared nodes and retains corresponding tokens to accelerate decoding, with safeguards such as a confidence threshold and intermittent application. Experiments on TestEval across Python, Java, and C++ with DiffuCoder and Dream show substantial speedups (roughly 1.5–2x) while preserving line coverage, and in some cases improving maximum coverage. The approach generalizes across models and languages and is publicly released, offering a practical UTG acceleration solution for software quality assurance.

Abstract

Software development relies heavily on extensive unit testing, which makes the efficiency of automated Unit Test Generation (UTG) particularly important. However, most existing LLMs generate test cases one token at a time in each forward pass, which leads to inefficient UTG. Recently, diffusion LLMs (dLLMs) have emerged, offering promising parallel generation capabilities and showing strong potential for efficient UTG. Despite this advantage, their application to UTG is still constrained by a clear trade-off between efficiency and test quality, since increasing the number of tokens generated in each step often causes a sharp decline in the quality of test cases. To overcome this limitation, we present DiffTester, an acceleration framework specifically tailored for dLLMs in UTG. The key idea of DiffTester is that unit tests targeting the same focal method often share repetitive structural patterns. By dynamically identifying these common patterns through abstract syntax tree analysis during generation, DiffTester adaptively increases the number of tokens produced at each step without compromising the quality of the output. To enable comprehensive evaluation, we extend the original TestEval benchmark, which was limited to Python, by introducing additional programming languages including Java and C++. Extensive experiments on three benchmarks with two representative models show that DiffTester delivers significant acceleration while preserving test coverage. Moreover, DiffTester generalizes well across different dLLMs and programming languages, providing a practical and scalable solution for efficient UTG in software development. Code and data are publicly available at https://github.com/wellbeingyang/DLM4UTG-open .
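
To make the idea of adaptively producing more tokens per step concrete, here is a minimal sketch of one decoding step. It assumes a baseline that unmasks the `k` most confident masked positions, plus extra positions that belong to a detected shared pattern and pass a confidence threshold. The function name `select_positions`, its parameters, and the exact selection rule are hypothetical illustrations, not the authors' implementation.

```python
def select_positions(confidences, masked, pattern_positions, k=1, threshold=0.5):
    """Pick which masked positions to unmask in one decoding step (illustrative)."""
    # Baseline: the k masked positions with the highest model confidence.
    ranked = sorted(masked, key=lambda i: confidences[i], reverse=True)
    keep = set(ranked[:k])
    # Pattern-guided extra (assumed rule): positions flagged as part of a
    # shared pattern are also kept, but only if they pass the threshold.
    keep |= {
        i for i in pattern_positions
        if i in masked and confidences[i] >= threshold
    }
    return sorted(keep)


confidences = [0.9, 0.2, 0.8, 0.6, 0.3]
masked = {0, 1, 2, 3, 4}

baseline = select_positions(confidences, masked, pattern_positions=set())
guided = select_positions(confidences, masked, pattern_positions={2, 3, 4})
print(baseline, guided)
```

In this toy run the guided step unmasks three positions instead of one, while position 4 is held back because its confidence falls below the threshold.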

Paper Structure

This paper contains 40 sections, 19 figures, 3 tables, and 2 algorithms.

Figures (19)

  • Figure 1: Repetitive structural and syntactic patterns frequently emerge in unit test cases generated at an intermediate step of dLLM inference before remasking.
  • Figure 2: Overview of our proposed DiffTester.
  • Figure 3: Extract shared nodes between two ASTs and locate their corresponding tokens in the generated code. Square boxes represent non-leaf nodes, while ellipses indicate leaf nodes. The colored tokens in the code at the top of the figure highlight the tokens that can be additionally retained according to the merged AST.
  • Figure 4: Comparison of line coverage with and without DiffTester at equal decoding time.
  • Figure 5: Case study on Dream illustrating the decoding process with and without DiffTester.
  • ...and 14 more figures
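
The shared-node extraction described in the Figure 3 caption can be approximated for Python with the standard `ast` module. The sketch below is a simplification under stated assumptions: both candidates are complete, parseable tests (intermediate dLLM outputs may still contain masks), only two trees are compared, and only `Name` and `Constant` leaves with identical source text are reported. The helper names `shared_nodes` and `retainable_tokens` are hypothetical.

```python
import ast


def shared_nodes(a, b):
    """Pair up nodes of two ASTs top-down while their node types match."""
    if type(a) is not type(b):
        return []
    pairs = [(a, b)]
    for child_a, child_b in zip(ast.iter_child_nodes(a), ast.iter_child_nodes(b)):
        pairs.extend(shared_nodes(child_a, child_b))
    return pairs


def retainable_tokens(src_a, src_b):
    """Return (line, column, text) in src_a for leaves shared by both tests."""
    tokens = []
    for node_a, node_b in shared_nodes(ast.parse(src_a), ast.parse(src_b)):
        if isinstance(node_a, (ast.Name, ast.Constant)):
            text_a = ast.get_source_segment(src_a, node_a)
            # Keep the leaf only if both candidates spell it identically.
            if text_a == ast.get_source_segment(src_b, node_b):
                tokens.append((node_a.lineno, node_a.col_offset, text_a))
    return tokens


test_a = "def test_add():\n    s = Solution()\n    assert s.add(1, 2) == 3\n"
test_b = "def test_add():\n    s = Solution()\n    assert s.add(10, 20) == 30\n"

tokens = retainable_tokens(test_a, test_b)
print(tokens)
```

Here the two tests share the object setup and the call target, so those identifiers are reported, while the differing argument and expected-value literals are left out.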