Studying the Impact of Early Test Termination Due to Assertion Failure on Code Coverage and Spectrum-based Fault Localization

Md. Ashraf Uddin, Shaowei Wang, An Ran Chen, Tse-Hsun Chen, Muhammad Asaduzzaman

TL;DR

This study investigates how early test termination, caused by assertion failures in multi-assertion tests, degrades code coverage and the effectiveness of spectrum-based fault localization (SBFL). Through an empirical study of 207 versions of six open-source projects, the authors quantify the prevalence of early termination (19.1% of failed tests) and its impact on instruction, line, branch, and method coverage. They propose two mitigation approaches, Trycatch and Slicing, that force test execution to continue past a failed assertion, and they evaluate the effect on SBFL (measured by EXAM, Top@k, and MFR) using Ochiai and Tarantula. Slicing yields the strongest improvements; eliminating early termination improves MFR by 15.1% for Ochiai and 10.7% for Tarantula. The findings offer practical guidance: prefer single-assertion tests, or apply slicing, to preserve coverage information and improve fault localization. Data and replication resources are publicly available.
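To make the mechanism concrete, the following is a minimal Python sketch of early test termination and of two mitigations in the spirit of Trycatch and Slicing. It is an illustration under stated assumptions, not the authors' implementation: the functions `parse_width`, `parse_height`, and `parse_depth`, the seeded bug, and the `covered` set (a stand-in for a real coverage tool) are all hypothetical, and the sliced tests here simply keep one assertion each rather than performing real program slicing.

```python
# Hypothetical code under test. `covered` stands in for a real coverage tool.
covered = set()

def parse_width(text):
    covered.add("parse_width")
    return int(text)

def parse_height(text):
    covered.add("parse_height")
    return int(text) + 1  # seeded bug for illustration

def parse_depth(text):
    covered.add("parse_depth")
    return int(text)

def multi_assertion_test():
    """Three assertions in one test; the second one fails."""
    assert parse_width("3") == 3
    assert parse_height("4") == 4  # fails here, so the test stops
    assert parse_depth("5") == 5   # never executed

def trycatch_style_test():
    """Same checks, but each failure is recorded instead of raised."""
    checks = [
        lambda: parse_width("3") == 3,
        lambda: parse_height("4") == 4,
        lambda: parse_depth("5") == 5,
    ]
    failures = []
    for index, check in enumerate(checks):
        try:
            assert check(), f"assertion {index} failed"
        except AssertionError as error:
            failures.append(str(error))
    return failures

def sliced_tests():
    """Simplified slicing: one single-assertion test per original assertion."""
    def test_width():
        assert parse_width("3") == 3
    def test_height():
        assert parse_height("4") == 4
    def test_depth():
        assert parse_depth("5") == 5
    return [test_width, test_height, test_depth]

def run_and_collect(test):
    """Run one test; return (failed, set of functions it covered)."""
    covered.clear()
    try:
        failed = bool(test())  # a non-empty failure list counts as failed
    except AssertionError:
        failed = True
    return failed, set(covered)
```

In this sketch, `run_and_collect(multi_assertion_test)` reports a failure but never reaches `parse_depth`, whereas `run_and_collect(trycatch_style_test)` reports the same failure with all three functions covered. Running each test returned by `sliced_tests()` separately also exercises all three functions, and additionally isolates the failure to a single test.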

Abstract

An assertion is commonly used in software testing to validate the expected program behavior (e.g., whether the value returned by a method equals an expected value). Although it is a recommended practice to use only one assertion in a single test to avoid code smells (e.g., Assertion Roulette), it is common to have multiple assertions in a single test. One issue with tests that have multiple assertions is that when the test fails at an early assertion (not the last one), the test terminates at that point and the remaining test code is not executed. This, in turn, can reduce code coverage and the performance of techniques that rely on code coverage information (e.g., spectrum-based fault localization). We refer to such a scenario as early test termination. Understanding the impact of early test termination on test coverage is important for software testing and debugging, particularly for techniques that rely on coverage information obtained from testing. We conducted the first empirical study of early test termination due to assertion failure by investigating 207 versions of 6 open-source projects. We found that a non-negligible portion of the failed tests (19.1%) terminate early due to assertion failure. Our findings indicate that early test termination harms both code coverage and the effectiveness of spectrum-based fault localization. For instance, eliminating early test termination improves line/branch coverage in 55% of the studied versions and improves the performance of two popular SBFL techniques, Ochiai and Tarantula, by 15.1% and 10.7%, respectively, in terms of MFR compared to the original setting (without eliminating early test termination).
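The link between truncated coverage and SBFL can be sketched with the standard Ochiai and Tarantula formulas. The example below is a toy illustration, not data from the study: the elements `a`, `b`, `c`, the tests `t1` to `t3`, and both coverage maps are hypothetical. It assumes element `c` is the faulty one and that the failing test `t3` reaches `c` only when execution continues past its first failed assertion.

```python
import math

def spectrum_counts(element, coverage, outcomes):
    """Return (ef, ep, nf, np_) for one program element.

    coverage: test name -> set of elements the test executed
    outcomes: test name -> True if the test failed
    """
    ef = ep = nf = np_ = 0
    for test, executed in coverage.items():
        hit = element in executed
        if outcomes[test]:
            if hit:
                ef += 1
            else:
                nf += 1
        else:
            if hit:
                ep += 1
            else:
                np_ += 1
    return ef, ep, nf, np_

def ochiai(ef, ep, nf, np_):
    # ef / sqrt((ef + nf) * (ef + ep)), defined as 0 when the denominator is 0
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def tarantula(ef, ep, nf, np_):
    # Ratio of the failing-test hit rate to the sum of both hit rates
    fail_ratio = ef / (ef + nf) if ef + nf else 0.0
    pass_ratio = ep / (ep + np_) if ep + np_ else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total else 0.0

def rank(formula, coverage, outcomes, elements):
    """Return (elements sorted by descending suspiciousness, score map)."""
    scores = {e: formula(*spectrum_counts(e, coverage, outcomes)) for e in elements}
    return sorted(elements, key=lambda e: -scores[e]), scores

# Hypothetical spectrum: t3 is the only failing test.
elements = ["a", "b", "c"]
outcomes = {"t1": False, "t2": False, "t3": True}
truncated = {"t1": {"a", "b"}, "t2": {"a"}, "t3": {"a"}}   # t3 stopped early
full = {"t1": {"a", "b"}, "t2": {"a"}, "t3": {"a", "c"}}   # t3 ran to the end

order_truncated, _ = rank(ochiai, truncated, outcomes, elements)
order_full, _ = rank(ochiai, full, outcomes, elements)
```

With the truncated coverage, no failing test executes `c`, so its suspiciousness is 0 under both formulas and `a` is ranked first. With the full coverage, `c` is executed only by the failing test and moves to the top of the ranking. This is the kind of rank change that metrics such as MFR are meant to capture.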

Paper Structure

This paper contains 23 sections, 5 figures, 7 tables.

Figures (5)

  • Figure 1: A motivating example of early test termination: an assertion fails on line 13, causing all the remaining tests and assertions to be skipped.
  • Figure 2: The overview framework for answering the RQs.
  • Figure 3: An example of a test after applying Trycatch to the motivating example.
  • Figure 4: Examples of sliced tests based on the motivating example. Due to space limitations, only two of them are presented.
  • Figure 5: The function tested by the code in Figure 1.