Automated Unit Test Refactoring

Yi Gao, Xing Hu, Xiaohu Yang, Xin Xia

TL;DR

This work addresses test smells in unit tests by introducing UTRefactor, a context-enhanced, LLM-based framework that uses an external knowledge base and a domain-specific language (DSL) to automate test refactoring in Java. The system combines preprocessing (test extraction, smell detection, and test context collection), a knowledge base (standardized smell definitions and 13 DSL-based refactoring rules), and chain-of-thought (CoT) guided refactoring with a checkpoint mechanism that targets the removal of every detected smell. Evaluated on 879 tests from six open-source Java projects, UTRefactor reduced the number of smells from 2,375 to 265 (an 89% reduction) and outperformed the baseline approaches, including a generic LLM prompt and TESTAXE, across multiple smell categories. The results indicate improved test code quality with less manual effort, which is relevant to automated testing pipelines; open challenges include smells that remain hard to eliminate and extension to languages other than Java.

Abstract

Test smells arise from poor design practices and insufficient domain knowledge, which can lower the quality of test code and make it harder to maintain and update. Manually refactoring test smells is time-consuming and error-prone, highlighting the necessity for automated approaches. Current rule-based refactoring methods often struggle in scenarios not covered by predefined rules and lack the flexibility needed to handle diverse cases effectively. In this paper, we propose a novel approach called UTRefactor, a context-enhanced, LLM-based framework for automatic test refactoring in Java projects. UTRefactor extracts relevant context from test code and leverages an external knowledge base that includes test smell definitions, descriptions, and DSL-based refactoring rules. By simulating the manual refactoring process through a chain-of-thought approach, UTRefactor guides the LLM to eliminate test smells in a step-by-step process, ensuring both accuracy and consistency throughout the refactoring. Additionally, we implement a checkpoint mechanism to facilitate comprehensive refactoring, particularly when multiple smells are present. We evaluate UTRefactor on 879 tests from six open-source Java projects, reducing the number of test smells from 2,375 to 265, achieving an 89% reduction. UTRefactor outperforms direct LLM-based refactoring methods by 61.82% in smell elimination and significantly surpasses the performance of a rule-based test smell refactoring tool. Our results demonstrate the effectiveness of UTRefactor in enhancing test code quality while minimizing manual involvement.
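
The abstract does not spell out how the checkpoint mechanism works, so the following is only a minimal sketch of one plausible reading: re-run smell detection after each refactoring pass and stop once nothing is reported or a round limit is reached. Every name here (`CheckpointLoopSketch`, `refactorUntilClean`, `reductionPercent`, the toy detector and refactoring step) is hypothetical and not taken from UTRefactor; the detector and the refactoring step are stand-ins for a smell-detection tool and an LLM call.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.UnaryOperator;

/** Illustrative sketch only; none of these names come from the paper. */
final class CheckpointLoopSketch {

    /**
     * Applies refactorStep repeatedly until detector reports no smells
     * or maxRounds passes have been made. Both callbacks are hypothetical
     * stand-ins (a smell detector and a single LLM refactoring pass).
     */
    static String refactorUntilClean(String testCode,
                                     Function<String, List<String>> detector,
                                     UnaryOperator<String> refactorStep,
                                     int maxRounds) {
        String current = testCode;
        for (int round = 0; round < maxRounds; round++) {
            if (detector.apply(current).isEmpty()) {
                break; // checkpoint passed: no smells remain
            }
            current = refactorStep.apply(current);
        }
        return current;
    }

    /** Percentage of smells eliminated, given counts before and after. */
    static double reductionPercent(int before, int after) {
        return 100.0 * (before - after) / before;
    }

    public static void main(String[] args) {
        // Toy detector (assumption): flags any Thread.sleep call as a smell.
        Function<String, List<String>> detector = code ->
                code.contains("Thread.sleep") ? List.of("Sleepy Test") : List.of();
        // Toy refactoring step (assumption): simply drops the sleep call.
        UnaryOperator<String> step = code -> code.replace("Thread.sleep(100);", "");

        String cleaned = refactorUntilClean(
                "setUp(); Thread.sleep(100); assertReady();", detector, step, 3);
        System.out.println(cleaned);

        // Reduction for the counts reported above (2,375 smells before, 265 after).
        System.out.println(reductionPercent(2375, 265));
    }
}
```

With the reported counts, `reductionPercent(2375, 265)` evaluates to roughly 88.8, consistent with the 89% figure quoted in the abstract. The round limit is a design choice of this sketch: it keeps the loop from running indefinitely when a refactoring pass fails to remove a smell.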

Paper Structure

This paper contains 28 sections, 9 figures, and 7 tables.

Figures (9)

  • Figure 1: An example of test refactoring within Gson.
  • Figure 2: Overview of our approach.
  • Figure 3: An example of the LLM's explanation of a test smell.
  • Figure 4: A hierarchical definition of the DSL structure for test refactoring rules.
  • Figure 5: DSL rules for refactoring three types of test smells, accompanied by code examples illustrating the test code before and after refactoring.
  • ...and 4 more figures