MR-Scout: Automated Synthesis of Metamorphic Relations from Existing Test Cases
Congying Xu, Valerio Terragni, Hengcheng Zhu, Jiarong Wu, Shing-Chi Cheung
TL;DR
MR-Scout tackles the oracle problem in automated testing by automatically discovering metamorphic relations (MRs) encoded in developer-written tests and synthesizing them into codified, parameterized MR methods. Its three-phase pipeline (discovery of MR-encoded test cases, or MTCs; MR synthesis; and MR filtering) produces high-quality, reusable MR-based test oracles that can be integrated with automated input generators such as EvoSuite. Empirically, MR-Scout discovers over 11,000 MTCs across 701 OSS projects, and over 97% of the codified MRs are of high quality for automated test case generation. Tests based on codified MRs also improve the test adequacy of programs that already have developer-written tests, increasing line coverage by 13.52% and mutation score by 9.42%. A qualitative study shows that 55.76% to 76.92% of codified MRs are easily comprehensible to developers, supporting practical adoption for test maintenance and migration.
Abstract
Metamorphic Testing (MT) alleviates the oracle problem by defining oracles based on metamorphic relations (MRs) that govern multiple related inputs and their outputs. However, designing MRs is challenging because it requires domain-specific knowledge, which hinders the widespread adoption of MT. We observe that developer-written test cases can embed domain knowledge that encodes MRs. Such encoded MRs could be synthesized for testing not only their original programs but also other programs that share similar functionalities. In this paper, we propose MR-Scout to automatically synthesize MRs from test cases in open-source software (OSS) projects. MR-Scout first discovers MR-encoded test cases (MTCs), then synthesizes the encoded MRs into parameterized methods (called codified MRs), and finally filters out codified MRs that demonstrate poor quality for new test case generation. MR-Scout discovered over 11,000 MTCs from 701 OSS projects. Experimental results show that over 97% of codified MRs are of high quality for automated test case generation, demonstrating the practical applicability of MR-Scout. Furthermore, tests based on codified MRs effectively enhance the test adequacy of programs with developer-written tests, leading to 13.52% and 9.42% increases in line coverage and mutation score, respectively. Our qualitative study shows that 55.76% to 76.92% of codified MRs are easily comprehensible for developers.
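As an illustration of the difference between an MR-encoded test case and a codified MR, the following is a minimal Java sketch. It is not output produced by MR-Scout. The class name `CodifiedMrSketch`, the method names, and the chosen relation (pushing an element onto a stack and then popping it leaves the size unchanged) are all assumptions made for this example, and the boolean return value is a simplification chosen so that the relation is easy to check.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical example; names and the chosen relation are illustrative only.
class CodifiedMrSketch {

    // Developer-style test with hard-coded inputs. It implicitly encodes an MR:
    // push followed by pop leaves the stack size unchanged.
    static void testPushThenPop() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        int sizeBefore = stack.size();  // output for the source input
        stack.push(42);                 // input transformation ...
        stack.pop();                    // ... producing the follow-up state
        int sizeAfter = stack.size();   // output for the follow-up input
        if (sizeBefore != sizeAfter) {
            throw new AssertionError("push followed by pop changed the size");
        }
    }

    // The same relation as a parameterized method: the hard-coded values
    // become parameters, so an input generator can supply new ones.
    // Note: ArrayDeque rejects null elements, so element must be non-null.
    static <T> boolean pushThenPopPreservesSize(Deque<T> stack, T element) {
        int sizeBefore = stack.size();
        stack.push(element);
        stack.pop();
        return sizeBefore == stack.size();
    }

    public static void main(String[] args) {
        testPushThenPop();
        // Stand-in for automatically generated inputs.
        List<List<Integer>> generated = List.of(List.of(), List.of(7), List.of(3, 1, 4));
        for (List<Integer> contents : generated) {
            Deque<Integer> stack = new ArrayDeque<>(contents);
            if (!pushThenPopPreservesSize(stack, 99)) {
                throw new AssertionError("MR violated for " + contents);
            }
        }
        System.out.println("MR held for all inputs");
    }
}
```

In this sketch the relation acts as the oracle: no expected output is written down for any individual input, which is what allows the parameterized method to be paired with an automated input generator.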
