Requirements Coverage-Guided Minimization for Natural Language Test Cases

Rongqi Pan, Feifei Niu, Lionel C. Briand, Hanyang Hu

TL;DR

This work addresses redundancy in requirement-based automotive test suites by introducing RTM, a minimization approach that preserves full requirement coverage under a fixed budget. RTM combines three preprocessing methods, seven text embedding techniques, and three distance functions to convert natural-language test cases into vector representations and measure their similarity; a genetic algorithm with coverage-preserving initialization strategies then searches for a diverse subset of the budgeted size. On an industrial automotive dataset, RTM consistently achieves a higher fault detection rate (FDR) than the baselines while maintaining 100% requirement coverage, and its performance is reported to be robust to varying redundancy levels and scalable in runtime. The study also provides open replication materials to support reproducibility and further research in requirement-driven testing under practical constraints.

Abstract

As software systems evolve, test suites tend to grow in size and often contain redundant test cases. Such redundancy increases testing effort, time, and cost. Test suite minimization (TSM) aims to eliminate such redundancy while preserving key properties such as requirement coverage and fault detection capability. In this paper, we propose RTM (Requirement coverage-guided Test suite Minimization), a novel TSM approach designed for requirement-based testing (validation), which can effectively reduce test suite redundancy while ensuring full requirement coverage and a high fault detection rate (FDR) under a fixed minimization budget. Based on common practice in critical systems where functional safety is important, we assume test cases are specified in natural language and traced to requirements before being implemented. RTM preprocesses test cases using three different preprocessing methods and then converts them into vector representations using seven text embedding techniques. Pairwise similarity between vectors is computed using three distance functions. A Genetic Algorithm, whose population is initialized by coverage-preserving initialization strategies, is then employed to identify an optimized subset of diverse test cases whose size matches the set budget. We evaluate RTM on an industrial automotive system dataset comprising $736$ system test cases and $54$ requirements. Experimental results show that RTM consistently outperforms baseline techniques in terms of FDR across different minimization budgets while maintaining full requirement coverage. Furthermore, we investigate the impact of test suite redundancy levels on the effectiveness of TSM, providing new insights into optimizing requirement-based test suites under practical constraints.
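The two ingredients the abstract names for the search step, coverage-preserving initialization and a diversity objective over embedding vectors, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names, the greedy-random way an individual is seeded, the use of cosine distance as the distance function, and mean pairwise distance as the fitness. The genetic operators themselves (selection, crossover, mutation) are omitted.

```python
import random
from itertools import combinations


def cosine_distance(u, v):
    """Cosine distance between two equal-length vectors (1.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def covers_all(subset, coverage, requirements):
    """True if the tests in `subset` together cover every requirement."""
    covered = set()
    for test in subset:
        covered |= coverage[test]
    return requirements <= covered


def coverage_preserving_subset(coverage, requirements, budget, rng):
    """Hypothetical initializer: build one individual of exactly `budget`
    tests that covers all requirements.

    Requirements are visited in random order; for each one that is still
    uncovered, a random covering test is added. The subset is then padded
    with random remaining tests up to the budget.
    """
    chosen, covered = set(), set()
    reqs = sorted(requirements)
    rng.shuffle(reqs)
    for req in reqs:
        if req in covered:
            continue
        candidates = [t for t in coverage if req in coverage[t]]
        if not candidates:
            raise ValueError(f"no test covers requirement {req!r}")
        test = rng.choice(candidates)
        chosen.add(test)
        covered |= coverage[test]
    if len(chosen) > budget:
        raise ValueError("budget too small for this covering subset")
    remaining = [t for t in coverage if t not in chosen]
    chosen.update(rng.sample(remaining, budget - len(chosen)))
    return chosen


def diversity(subset, vectors):
    """Assumed fitness: mean pairwise cosine distance within the subset."""
    pairs = list(combinations(sorted(subset), 2))
    if not pairs:
        return 0.0
    total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
    return total / len(pairs)
```

A population seeded this way starts inside the feasible region (full coverage at the target size), so a search procedure only has to keep individuals feasible while increasing diversity. In this sketch, `coverage` maps each test identifier to the set of requirements it is traced to, and `vectors` maps each test identifier to its embedding.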

Paper Structure

This paper contains 29 sections, 14 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: A sanitized example of a test case in our dataset.
  • Figure 2: Overall framework of RTM.
  • Figure 3: Comparison of FDR across minimization budgets for RTM, FAST-R, and Random Minimization, alongside the theoretical upper bound.
  • Figure 4: Comparison of FDR across varying redundancy levels for RTM and baseline approaches under seven minimization budgets.
  • Figure 5: Scatter plots of the number of test cases against preparation time, search time, and total minimization time (in seconds) for RTM and baselines under a 50% minimization budget.