A Catalog of Transformations to Remove Smells From Natural Language Tests
Manoel Aranda, Naelson Oliveira, Elvys Soares, Márcio Ribeiro, Davi Romão, Ullyanne Patriota, Rohit Gheyi, Emerson Souza, Ivan Machado
TL;DR
Natural language test smells undermine the quality of manual tests. The authors present seven NLP-based transformations and an automated tool to remove these smells, validated through a survey of testing professionals and a study of real-practice Ubuntu tests, achieving an F-Measure of 83.70% and high practitioner acceptance. The work formalizes a Natural Language Test Template, demonstrates left-right transformation mappings, and delivers a replication package and a spaCy-based implementation. Together, the catalog and tool offer a practical path to improve the reliability and maintainability of natural language tests, with clear directions for extending smell coverage and enhancing automation through NLP advances.
Abstract
Test smells can pose difficulties during testing activities, such as poor maintainability, non-deterministic behavior, and incomplete verification. Existing research has extensively addressed test smells in automated software tests, but little attention has been given to smells in natural language tests. While some research has identified and catalogued such smells, there is a lack of systematic approaches for their removal. Consequently, there is also a lack of tools to automatically identify and remove natural language test smells. This paper introduces a catalog of transformations designed to remove seven natural language test smells and a companion tool implemented using Natural Language Processing (NLP) techniques. Our work aims to enhance the quality and reliability of natural language tests during software development. The research employs a two-fold empirical strategy to evaluate its contributions. First, a survey involving 15 software testing professionals assesses the acceptance and usefulness of the catalog's transformations. Second, an empirical study evaluates our tool for removing natural language test smells by analyzing a sample of real-practice tests from the Ubuntu OS. The results indicate that software testing professionals find the transformations valuable. Additionally, the automated tool demonstrates a good level of precision, as evidenced by an F-Measure of 83.70%.
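For readers unfamiliar with the metric, the F-Measure combines precision and recall into a single score. The sketch below assumes the balanced variant (F1, the harmonic mean of the two); the abstract does not state which variant was used, and the function name `f_measure` and the input values are hypothetical illustrations, not figures from the study.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    With beta == 1.0 this is the balanced F1 score. Assumption: the
    paper's reported F-Measure is of this balanced form.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        # Both inputs are zero: define the score as 0 to avoid dividing by zero.
        return 0.0
    return (1 + b2) * precision * recall / denominator


# Illustrative inputs only; these are not the study's precision or recall.
example_score = f_measure(precision=0.8, recall=0.9)
print(f"F-Measure: {example_score:.4f}")
```

Because the harmonic mean is pulled toward the lower of its two inputs, a high F-Measure requires both precision and recall to be reasonably high.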
