Enhancing LLM-Based Test Generation by Eliminating Covered Code

WeiZhe Xu, Mengyu Liu, Fanxin Kong

TL;DR

This work proposes a scalable LLM-based unit test generation method that outperforms state-of-the-art LLM-based and search-based methods, demonstrating its effectiveness in achieving high coverage on complex methods.

Abstract

Automated test generation is essential for software quality assurance, with coverage rate serving as a key metric to ensure thorough testing. Recent advancements in Large Language Models (LLMs) have shown promise in improving test generation, particularly in achieving higher coverage. However, while existing LLM-based test generation solutions perform well on small, isolated code snippets, they struggle when applied to complex methods under test. To address this limitation, we propose a scalable LLM-based unit test generation method. Our approach consists of two key steps. The first step is context information retrieval, which uses both LLMs and static analysis to gather relevant contextual information associated with the complex methods under test. The second step, iterative test generation with code elimination, repeatedly generates unit tests for the current code slice, tracks the achieved coverage, and selectively removes code segments that have already been covered. This process simplifies the testing task and mitigates issues arising from token limits or reduced reasoning effectiveness associated with excessively long contexts. In comprehensive evaluations on open-source projects, our approach outperforms state-of-the-art LLM-based and search-based methods, demonstrating its effectiveness in achieving high coverage on complex methods.
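
The second step described in the abstract can be pictured as a simple loop. The sketch below is only an illustration under strong assumptions, not the authors' implementation: `iterative_generation`, `fake_generate`, and `fake_coverage` are hypothetical names, the LLM and the coverage tool are replaced by toy stubs, and "elimination" is reduced to dropping covered line numbers, whereas the paper's elimination also has to preserve the control flow and data dependencies needed to reach the uncovered lines.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical signatures: a "test" is just a string label in this sketch.
GenerateFn = Callable[[Dict[int, str]], List[str]]
CoverageFn = Callable[[List[str]], Set[int]]


def iterative_generation(
    source: Dict[int, str],
    generate_tests: GenerateFn,
    measure_coverage: CoverageFn,
    max_rounds: int = 5,
) -> Tuple[List[str], Set[int]]:
    """Return (accepted tests, line numbers that are still uncovered)."""
    uncovered: Set[int] = set(source)
    accepted: List[str] = []
    for _ in range(max_rounds):
        if not uncovered:
            break
        # Simplification (assumption): the slice is just the uncovered lines.
        code_slice = {n: source[n] for n in sorted(uncovered)}
        tests = generate_tests(code_slice)
        newly_covered = measure_coverage(tests) & uncovered
        if not newly_covered:
            break  # no progress in this round, so stop early
        accepted.extend(tests)
        uncovered -= newly_covered  # eliminate what is already covered
    return accepted, uncovered


def fake_generate(code_slice: Dict[int, str]) -> List[str]:
    # Stand-in for the LLM: targets the first remaining line of the slice.
    return [f"test_line_{min(code_slice)}"]


def fake_coverage(tests: List[str]) -> Set[int]:
    # Stand-in for a coverage tool: each test covers its line and the next.
    covered: Set[int] = set()
    for name in tests:
        line = int(name.rsplit("_", 1)[1])
        covered |= {line, line + 1}
    return covered


source = {n: f"stmt_{n}" for n in range(1, 6)}
tests, remaining = iterative_generation(source, fake_generate, fake_coverage)
```

With these stubs the slice shrinks every round, so each call to the generator sees less code than the one before; that shrinking prompt is the effect the abstract attributes to eliminating covered code.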

Paper Structure

This paper contains 25 sections, 1 theorem, 4 figures, 4 tables, and 2 algorithms.

Key Result

Proposition 1

Let $P$ be the original target unit, $U$ the set of uncovered lines, and $P'$ the program produced by Algorithm 1. Assume that (1) the fine-grained CFG soundly captures all execution paths of $P$; (2) all data dependencies of lines in $U$ lie on execution paths leading to them; and (3) the reconstru

Figures (4)

  • Figure 1: The Overview of Our Framework. Our approach consists of two steps: context information retrieval and iterative test generation with covered code elimination, which are illustrated by the green and orange dashed boxes in the figure, respectively.
  • Figure 2: Internal Structure of LLM-based Test Generation without Elimination Component.
  • Figure 3: The Format of the Prompt. {{$\cdot$}} serves as a placeholder, which will be replaced by variables during actual use.
  • Figure 4: An Example of Eliminating Covered Code.

Theorems & Definitions (1)

  • Proposition 1: Behavior Preservation