HITS: High-coverage LLM-based Unit Test Generation via Method Slicing
Zejun Wang, Kaibo Liu, Ge Li, Zhi Jin
TL;DR
HITS tackles the challenge of achieving high coverage for complex focal methods in automatic Java unit test generation: it decomposes each method into slices and prompts an LLM, with chain-of-thought guidance, to generate tests slice by slice. It combines static-context retrieval, problem-solution step decomposition, and post-processing (including self-debug fixes) to produce executable test suites. Evaluated on ten open-source projects against EvoSuite and multiple LLM baselines, HITS improves both line and branch coverage on complex methods by 10–20 percentage points, and ablation studies confirm the contributions of slicing, prompt design, and post-processing. The results underscore the practical potential of divide-and-conquer, slice-based prompting for LLM-driven test generation, while pointing to remaining challenges in executable test quality and context learning that can guide future enhancements.
Abstract
Large language models (LLMs) perform well at generating unit tests for Java projects, but they cover the complex focal methods within those projects poorly. Complex methods contain many conditions and loops, so the test cases must be diverse enough to cover all lines and branches. Existing LLM-based test generation methods, however, hand the entire method under test to the LLM without any assistance in analyzing its inputs. The LLM then has difficulty inferring the test inputs needed to cover all conditions, which leaves lines and branches uncovered. To tackle this problem, we propose decomposing focal methods into slices and asking the LLM to generate test cases slice by slice. Our method narrows the scope of analysis, making it easier for the LLM to cover more lines and branches in each slice. We build a dataset of complex focal methods collected from the projects used by existing state-of-the-art approaches. Our experimental results show that our method significantly outperforms current LLM-based test case generation methods and the typical SBST method EvoSuite in both line and branch coverage.
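To make the slice-by-slice idea concrete, the following is a minimal sketch in Java. Everything in it is hypothetical and not taken from the paper: the focal method `shippingCents`, the three slice boundaries marked in its comments, and the hand-written checks that stand in for generated tests. It only illustrates how tests written against one slice at a time can each target a small set of branches; in HITS the slices and tests are produced by the LLM, and a real project would likely use a test framework rather than the plain helper methods used here to keep the example self-contained.

```java
// Hypothetical example: one focal method, annotated with slices,
// plus one small group of checks per slice.
final class SlicedShippingDemo {

    // Focal method with several conditions and a loop.
    static int shippingCents(int weightGrams, boolean express, String countryCode) {
        // Slice 1: input validation.
        if (weightGrams <= 0) {
            throw new IllegalArgumentException("weight must be positive");
        }
        if (countryCode == null || countryCode.isEmpty()) {
            throw new IllegalArgumentException("country code required");
        }

        // Slice 2: base price by weight tier.
        int cents;
        if (weightGrams <= 500) {
            cents = 400;
        } else if (weightGrams <= 2000) {
            cents = 900;
        } else {
            cents = 900;
            int extra = weightGrams - 2000;
            while (extra > 0) {      // add 250 per started extra kilogram
                cents += 250;
                extra -= 1000;
            }
        }

        // Slice 3: surcharges.
        if (express) {
            cents *= 2;
        }
        if (!countryCode.equals("US")) {
            cents += 1500;
        }
        return cents;
    }

    static void expectEquals(int expected, int actual, String label) {
        if (expected != actual) {
            throw new AssertionError(label + ": expected " + expected + " but got " + actual);
        }
    }

    static void expectIllegalArgument(Runnable call, String label) {
        try {
            call.run();
        } catch (IllegalArgumentException e) {
            return; // expected path
        }
        throw new AssertionError(label + ": expected IllegalArgumentException");
    }

    // Tests aimed at slice 1 only: each invalid-input branch.
    static void testSlice1Validation() {
        expectIllegalArgument(() -> shippingCents(0, false, "US"), "zero weight");
        expectIllegalArgument(() -> shippingCents(100, false, null), "null country");
        expectIllegalArgument(() -> shippingCents(100, false, ""), "empty country");
    }

    // Tests aimed at slice 2 only: each tier, plus the loop running twice.
    static void testSlice2WeightTiers() {
        expectEquals(400, shippingCents(300, false, "US"), "light tier");
        expectEquals(900, shippingCents(1500, false, "US"), "medium tier");
        expectEquals(1400, shippingCents(3500, false, "US"), "heavy tier, two loop iterations");
    }

    // Tests aimed at slice 3 only: each surcharge on and off.
    static void testSlice3Surcharges() {
        expectEquals(800, shippingCents(300, true, "US"), "express only");
        expectEquals(1900, shippingCents(300, false, "DE"), "international only");
        expectEquals(2300, shippingCents(300, true, "DE"), "express and international");
    }

    public static void main(String[] args) {
        testSlice1Validation();
        testSlice2WeightTiers();
        testSlice3Surcharges();
        System.out.println("all slice tests passed");
    }
}
```

Each test group fixes the inputs that are irrelevant to its slice (for example, a light domestic parcel when exercising surcharges) and varies only the inputs that drive that slice's branches, which is the narrowing of analysis scope the abstract describes.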
