Reduction of Test Re-runs by Prioritizing Potential Order Dependent Flaky Tests

Hasnain Iqbal, Zerina Begum, Kazi Sakib

TL;DR

This work addresses the reliability problem posed by order-dependent (OD) flaky tests by introducing a static-analysis-based method that prioritizes the tests most likely to exhibit OD behavior. By analyzing shared static fields in test classes, the approach identifies candidate OD tests and applies a Tuscan Square-inspired intra-class ordering to only those prioritized tests, thereby reducing unnecessary re-runs. In a study of 27 Java project modules containing 189 OD tests, the method reduced test executions by 65.92% on average and unnecessary re-runs by 72.19%, while covering 177 of the 189 OD tests (a reported accuracy of 96.61%); 23 of the 27 modules reached 100% OD coverage. The results indicate that OD detection becomes considerably cheaper when tests are prioritized by static-field interactions, with potential for further gains from extending the analysis to the code under test.
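To make the Tuscan Square idea concrete, here is a minimal Python sketch, not the paper's implementation. It uses the standard zigzag construction for an even number of items and, as an assumption of this sketch, pads odd-sized inputs with a dummy slot that is dropped afterwards. The function name `tuscan_orders` and the test names are hypothetical. The point is that a handful of orders is enough for every ordered pair of tests to run back to back at least once.

```python
def tuscan_orders(tests):
    """Return test orders such that every ordered pair of distinct tests
    appears consecutively in at least one of the returned orders."""
    items = list(tests)
    if len(items) < 2:
        return [items]

    # The zigzag construction needs an even size; pad odd inputs with one
    # dummy index that is filtered out of every row below (sketch assumption).
    n = len(items) + (len(items) % 2)

    # Base row 0, n-1, 1, n-2, ...: for even n its consecutive differences
    # hit every nonzero residue modulo n exactly once.
    base = []
    lo, hi = 0, n - 1
    while lo < hi:
        base.extend([lo, hi])
        lo += 1
        hi -= 1

    orders = []
    for shift in range(n):
        # Each row is the base row shifted cyclically by `shift`.
        row = [(index + shift) % n for index in base]
        orders.append([items[i] for i in row if i < len(items)])
    return orders


if __name__ == "__main__":
    for order in tuscan_orders(["testA", "testB", "testC", "testD"]):
        print(order)
```

For four tests this yields 4 orders covering all 12 ordered pairs, compared with 24 full permutations. Restricting the input to the prioritized tests of a class, as the summary describes, shrinks the number of orders further.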

Abstract

Flaky tests can make automated software testing unreliable due to their unpredictable behavior. These tests can pass or fail on the same code base across multiple runs. However, flaky tests often do not indicate any fault, even though they can cause the continuous integration (CI) pipeline to fail. A common type of flaky test is the order-dependent (OD) test. The outcome of an OD test depends on the order in which it is run with respect to other test cases. Several studies have explored the detection and repair of OD tests. However, their methods repeatedly re-run tests that are unrelated to the order dependence. Hence, prioritizing potential OD tests is necessary to reduce the re-runs. In this paper, we propose a method to prioritize potential order-dependent tests. By analyzing shared static fields in test classes, we identify tests that are more likely to be order-dependent. In our experiment on 27 project modules, our method successfully prioritized all OD tests in 23 cases, reducing test executions by an average of 65.92% and unnecessary re-runs by 72.19%. These results demonstrate that our approach significantly improves the efficiency of OD test detection by lowering execution costs.
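The prioritization step can be illustrated with a small Python sketch. It is an illustration under stated assumptions rather than the authors' algorithm: the static analysis itself is assumed to have already produced, for each test, the static fields it reads and writes, and a pair of tests is flagged only when one of them writes a field the other touches. The function name `candidate_od_tests`, the input shape, and the test and field names are all hypothetical.

```python
def candidate_od_tests(field_access):
    """Return the sorted names of tests that share a static field with
    another test where at least one of the two writes that field.

    field_access maps a test name to {"reads": set, "writes": set} of
    static field names (assumed output of a separate static analysis).
    """
    candidates = set()
    names = sorted(field_access)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            a, b = field_access[first], field_access[second]
            touched_a = a["reads"] | a["writes"]
            touched_b = b["reads"] | b["writes"]
            # Two tests that only read the same field cannot pollute each
            # other's state, so a write on one side is required (assumption).
            if (a["writes"] & touched_b) or (b["writes"] & touched_a):
                candidates.update((first, second))
    return sorted(candidates)


if __name__ == "__main__":
    access = {
        "testCreate": {"reads": set(), "writes": {"Registry.instance"}},
        "testLookup": {"reads": {"Registry.instance"}, "writes": set()},
        "testFormat": {"reads": {"Config.LOCALE"}, "writes": set()},
    }
    print(candidate_od_tests(access))
```

Only the flagged tests would then be re-run in different orders; tests with no shared, written static field are left out, which is where the reduction in re-runs comes from.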

Paper Structure

This paper contains 15 sections, 7 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Volume of unnecessary re-runs
  • Figure 2: Probable Order Dependent Test Pair, Test Class from project Activity
  • Figure 3: Pseudo Code of Prioritizing Candidate OD Tests
  • Figure 4: Number of test cases before and after reduction