Automatically Generating Single-Responsibility Unit Tests
Geraldine Galindo-Gutierrez
TL;DR
The paper tackles the problem that automatic test generation prioritizes coverage at the cost of understandability, producing tests whose focal method is unclear. It proposes focal-method-oriented test representations that enforce the single-responsibility principle by construction, separating setup from behavior and tying assertions to the focal method, with implementations in EvoSuite using DynaMOSA. The work outlines four research questions covering coverage, single responsibility, the coherence of assertions with focal methods, and developer-based assessments of understandability. If successful, the approach could increase industry adoption of test generation tools by producing more readable and maintainable tests without materially reducing effectiveness.
Abstract
Automatic test generation aims to save developers time and effort by producing test suites with reasonably high coverage and fault detection. However, because search-based generation tools focus on maximizing coverage, other properties, such as test quality, are left to chance. Evidence shows that developers remain skeptical of generated tests because they are hard to understand. Generated tests do not follow a defined structure as they evolve, which can result in tests that contain method calls added to improve coverage but with no clear relation to the generated assertions. In my doctoral research, I aim to investigate the effects of imposing a predefined structure on generated tests, based on the single-responsibility principle, to make the focal method under test easier to identify. To achieve this, we propose to implement different test representations for evolution and evaluate their impact on coverage, fault detection, and understandability. We hypothesize that improving the structure of generated tests will improve their understandability without significantly affecting their effectiveness. We aim to conduct a quantitative analysis of the proposed approach as well as a developer evaluation of the understandability of the resulting tests.
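To make the idea of a structured test representation concrete, the following is a minimal sketch in Python. It is an illustration only and not EvoSuite's internal representation: the names `Call`, `Assertion`, `StructuredTest`, and `is_single_responsibility` are hypothetical, and the check shown (every assertion must observe the focal call's return value or its receiver) is one assumed reading of "tying assertions to the focal method".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Call:
    """A method call in a generated test (hypothetical model)."""
    receiver: str                 # variable the method is invoked on
    method: str                   # name of the invoked method
    result: Optional[str] = None  # variable bound to the return value, if any


@dataclass(frozen=True)
class Assertion:
    """An assertion that checks the value of one variable."""
    variable: str
    expected: object


@dataclass
class StructuredTest:
    """A test split into setup calls, one focal call, and assertions."""
    setup: List[Call]
    focal: Call
    assertions: List[Assertion]

    def is_single_responsibility(self) -> bool:
        # Assumption: an assertion is tied to the focal method if it checks
        # the focal call's return value or the object the call was made on.
        observable = {self.focal.receiver}
        if self.focal.result is not None:
            observable.add(self.focal.result)
        return bool(self.assertions) and all(
            a.variable in observable for a in self.assertions
        )


# Setup pushes a value; the focal call is pop(); the assertion checks its result.
coherent = StructuredTest(
    setup=[Call("stack", "push")],
    focal=Call("stack", "pop", result="popped"),
    assertions=[Assertion("popped", 42)],
)

# Same calls, but the assertion checks a variable unrelated to the focal call.
incoherent = StructuredTest(
    setup=[Call("stack", "push")],
    focal=Call("stack", "pop", result="popped"),
    assertions=[Assertion("counter", 0)],
)
```

In this sketch, `coherent.is_single_responsibility()` holds while `incoherent.is_single_responsibility()` does not. Keeping the focal call in its own field, rather than in a flat statement list, is what would let a search operator mutate setup calls without ever obscuring which method the test is about.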
