
Test Adequacy for Metamorphic Testing: Criteria, Measurement, and Implication

An Fu, Chang-ai Sun, Jiaming Zhang, Huai Liu

TL;DR

This work addresses the gap in assessing metamorphic testing (MT) adequacy by introducing a $k$-MR coverage criterion that jointly accounts for metamorphic relations (MRs) and source inputs. It defines a formal MT adequacy measure, $C_{MT}^k$, that rewards diverse MR–source-input associations and varying test requirements, then validates the approach through large-scale empirical studies across seven programs and thousands of mutants. The results show that higher MT adequacy generally improves fault-detection effectiveness, with stronger coverage criteria (e.g., IO-CTF) achieving the best performance and revealing real-life defects. The findings offer practical guidelines for constructing effective MT test suites and provide a foundation for future work on MR/input selection and prioritization to balance effectiveness and cost.

Abstract

Metamorphic testing (MT) is a simple yet effective technique to alleviate the oracle problem in software testing. The underlying idea of MT is to test a software system by checking whether metamorphic relations (MRs) hold among multiple test inputs (including source and follow-up inputs) and the actual outputs of their executions. Since MRs and source inputs are two essential components of MT, considerable effort has been devoted to the systematic identification of MRs and the effective generation of source inputs, which has greatly enriched the fundamental theory of MT since its invention. However, few studies have investigated test adequacy assessment for MT, which hinders the objective measurement of MT's test quality as well as the effective construction of test suites. Although traditional software testing offers a number of test adequacy criteria that specify, from various perspectives, the testing requirements that constitute an adequate test, they are not in line with MT's focus, which is to test the software under test (SUT) from the perspective of necessary properties. In this paper, we propose a new set of criteria that specifies testing requirements from the perspective of necessary properties satisfied by the SUT, and design a test adequacy measurement that evaluates the degree of adequacy based on both MRs and source inputs. The experimental results show that the proposed measurement can effectively indicate the fault detection effectiveness of test suites, i.e., test suites with increased test adequacy usually exhibit higher effectiveness in fault detection. Our work is an attempt to assess the test adequacy of MT from a new perspective, and our criteria and measurement provide a new approach to evaluating the test quality of MT as well as guidelines for constructing effective MT test suites.
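To make the basic MT mechanism described in the abstract concrete, the following is a minimal sketch and is not taken from the paper. It assumes a hypothetical SUT that computes sine and uses the well-known relation $\sin(x) = \sin(\pi - x)$ as the MR. The names `sut_sin`, `buggy_sin`, `follow_up_input`, and `mr_holds` are illustrative assumptions, as are the source inputs.

```python
import math

def sut_sin(x: float) -> float:
    """Hypothetical software under test: a correct sine implementation."""
    return math.sin(x)

def buggy_sin(x: float) -> float:
    """Hypothetical faulty version: a truncated series, accurate only near 0."""
    return x - x**3 / 6

def follow_up_input(source: float) -> float:
    """Input subrelation of the MR: derive the follow-up input pi - x."""
    return math.pi - source

def mr_holds(program, source: float) -> bool:
    """Output subrelation of the MR: outputs for x and pi - x must agree.

    No expected value of sin(x) is needed, which is how MT sidesteps
    the oracle problem.
    """
    follow_up = follow_up_input(source)
    return math.isclose(program(source), program(follow_up), abs_tol=1e-9)

# Illustrative source inputs (assumed, not from the paper).
source_inputs = [0.1, 0.5, 1.0, 2.0]

violations_correct = [x for x in source_inputs if not mr_holds(sut_sin, x)]
violations_buggy = [x for x in source_inputs if not mr_holds(buggy_sin, x)]

print("violations (correct SUT):", violations_correct)
print("violations (faulty SUT):", violations_buggy)
```

In this sketch a fault is revealed whenever some pair of an MR and a source input is violated, which is why an adequacy measure for MT plausibly needs to consider both components together.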

Paper Structure (27 sections, 9 equations, 11 figures, 6 tables)


Figures (11)

  • Figure 1: Comparison between MT and conventional testing techniques. $p(t)$ denotes the actual output of executing input $t$ on $p$, and $f(t)$ denotes the test oracle for $p(t)$. Figure 1(a) illustrates the test result verification mechanism of conventional testing techniques; Figure 1(b) illustrates the test result verification mechanism of MT using an MR that relates two test inputs.
  • Figure 2: Illustration of $k$-MR coverage criterion. $R^{i}_{in}$ denotes the input subrelation of an MR and $R^{i}_{out}$ denotes the output subrelation of an MR.
  • Figure 3: Distribution of the fault detection effectiveness of $k$-MR coverage criterion under different values of $k$ on program PHONE.
  • Figure 4: Distribution of the fault detection effectiveness of $k$-MR coverage criterion under different test case adequacy criteria on program print_tokens.
  • Figure 5: Average of the fault detection rate of $k$-MR coverage criterion under different test case adequacy criteria on program grep.
  • ...and 6 more figures

Theorems & Definitions (5)

  • Definition 1: Association relationship between a source input and an MR
  • Definition 2: Test Adequacy Criterion for MT
  • Definition 3: MRs covered by a source input
  • Definition 4: $k$-MR coverage criterion, $\mathcal{C}_{\rm{MT}}(\mathcal{C}_{c})$
  • Definition 5: Test adequacy measurement based on $k$-MR coverage criterion