MR-Scout: Automated Synthesis of Metamorphic Relations from Existing Test Cases

Congying Xu, Valerio Terragni, Hengcheng Zhu, Jiarong Wu, Shing-Chi Cheung

TL;DR

MR-Scout tackles the oracle problem in automated testing by automatically discovering metamorphic relations (MRs) encoded in developer-written tests and synthesizing them into codified, parameterized MR methods. Its three-phase pipeline (MTC discovery, MR synthesis, and MR filtering) produces reusable MR-based test oracles that can be combined with automated input generators such as EvoSuite. Empirically, MR-Scout identifies over 11,000 MR-encoded test cases (MTCs) across 701 OSS projects with 97% precision, and tests based on codified MRs improve the test adequacy of programs that already have developer-written tests, raising line coverage by 13.52% and mutation score by 9.42%. A qualitative study finds that 55.76% to 76.92% of codified MRs are easily comprehensible to developers, supporting practical adoption for test maintenance and migration.

Abstract

Metamorphic Testing (MT) alleviates the oracle problem by defining oracles based on metamorphic relations (MRs) that govern multiple related inputs and their outputs. However, designing MRs is challenging, as it requires domain-specific knowledge. This hinders the widespread adoption of MT. We observe that developer-written test cases can embed domain knowledge that encodes MRs. Such encoded MRs could be synthesized for testing not only their original programs but also other programs that share similar functionalities. In this paper, we propose MR-Scout to automatically synthesize MRs from test cases in open-source software (OSS) projects. MR-Scout first discovers MR-encoded test cases (MTCs), then synthesizes the encoded MRs into parameterized methods (called codified MRs), and finally filters out MRs that demonstrate poor quality for new test case generation. MR-Scout discovered over 11,000 MTCs from 701 OSS projects. Experimental results show that over 97% of codified MRs are of high quality for automated test case generation, demonstrating the practical applicability of MR-Scout. Furthermore, tests based on codified MRs effectively enhance the test adequacy of programs with developer-written tests, leading to 13.52% and 9.42% increases in line coverage and mutation score, respectively. Our qualitative study shows that 55.76% to 76.92% of codified MRs are easily comprehensible to developers.
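
To make the notion of a codified MR more concrete, the following is a minimal sketch, not MR-Scout's actual output. It assumes the push/pop relation $x = stack.push(x).pop()$ and uses `java.util.ArrayDeque` as a stand-in for the class under test. The names `StackMrSketch`, `originalTest`, and `codifiedMr` are hypothetical and chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackMrSketch {

    // Developer-style test with a hard-coded input: it encodes the relation
    // x == stack.push(x).pop(), but only checks it for one concrete value.
    static boolean originalTest() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(42);
        return stack.pop() == 42;
    }

    // Hypothetical codified MR: the hard-coded stack and element become
    // parameters, so an input generator can supply new values.
    // Assumes x is non-null (ArrayDeque rejects null elements).
    static <T> boolean codifiedMr(Deque<T> stack, T x) {
        stack.push(x);          // source input: push x
        T popped = stack.pop(); // follow-up: pop the top element
        return x.equals(popped); // output relation: popped element is x
    }

    public static void main(String[] args) {
        System.out.println(originalTest());

        // The same relation checked on inputs the original test never used.
        System.out.println(codifiedMr(new ArrayDeque<String>(), "hello"));

        Deque<Integer> preFilled = new ArrayDeque<>();
        preFilled.push(1);
        preFilled.push(2);
        System.out.println(codifiedMr(preFilled, 7));
    }
}
```

In this sketch the parameterized method plays the role of a reusable oracle: any generated pair of stack and element can be passed in, and the returned boolean reports whether the relation held.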

Paper Structure (38 sections, 3 equations, 13 figures, 2 tables)

Figures (13)

  • Figure 1: A test case crafted from com.itextpdf.layout.renderer.TextRendererTest in project iText. Underlying MR: $\textit{IF}\ \mathit{text}_2 = \mathit{text}_1.\mathit{setBold}()\ \textit{THEN}\ \mathit{text}_1.\mathit{width}() \leq \mathit{text}_2.\mathit{width}()$.
  • Figure 2: A test case crafted from com.conversantmedia.util.concurrent.ConcurrentStackTest in project Disruptor. Underlying MR: $\mathit{x=stack.push(x).pop()}$, i.e., IF an element $x$ is pushed onto a stack and the stack subsequently pops off the top element, THEN the element $x$ should be the one popped.
  • Figure 3: Illustration of a wrapper function $f_c$ for a stack class implemented with methods push and pop. (The output of $f_c$("push", $x$) is the stack object after the argument $x$ has been pushed onto it, while the output of $f_c$("pop", $x$) consists of the element popped by executing stack.pop() and the stack object after that element has been popped.)
  • Figure 4: Overview of MR-Scout
  • Figure 5: Procedure of MR-Scout operating on the MTC simulateWidth()
  • ...and 8 more figures
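
The wrapper function described in the Figure 3 caption can be sketched roughly as follows. This is an illustrative assumption rather than the paper's implementation: the stack is passed explicitly as a parameter, `java.util.ArrayDeque` stands in for the stack class, and the names `WrapperSketch`, `Output`, and `fc` are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class WrapperSketch {

    // Output of one wrapper invocation: the stack after the call and,
    // for "pop", the element that was removed (null for "push").
    static final class Output {
        final Deque<Integer> stack;
        final Integer popped;

        Output(Deque<Integer> stack, Integer popped) {
            this.stack = stack;
            this.popped = popped;
        }
    }

    // Hypothetical wrapper f_c: dispatches on the method name so that both
    // push and pop can be viewed as invocations of a single function.
    static Output fc(Deque<Integer> stack, String method, Integer x) {
        switch (method) {
            case "push":
                stack.push(x);
                return new Output(stack, null);
            case "pop":
                // The argument x is ignored for pop in this sketch.
                return new Output(stack, stack.pop());
            default:
                throw new IllegalArgumentException("unknown method: " + method);
        }
    }

    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        Output afterPush = fc(stack, "push", 5);
        Output afterPop = fc(afterPush.stack, "pop", null);
        System.out.println(afterPop.popped);          // the push/pop MR expects 5
        System.out.println(afterPop.stack.isEmpty()); // stack is empty again
    }
}
```

Viewing both methods through one function lets the push/pop relation be phrased as a relation over two invocations of $f_c$ and their outputs.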

Theorems & Definitions (5)

  • Example 2.1
  • Example 2.2
  • Example 2.3
  • Example 3.1
  • Example 3.4