GenMorph: Automatically Generating Metamorphic Relations via Genetic Programming

Jon Ayerdi, Valerio Terragni, Gunel Jahangirova, Aitor Arrieta, Paolo Tonella

TL;DR

GenMorph tackles the oracle problem in software testing by automatically generating metamorphic relations (MRs) for Java methods using genetic programming. It builds candidate MRs from input transformation templates and evaluates them against executions of both the correct method and its mutants. Two fitness functions drive a co-evolutionary, multi-objective search: one minimizes false positives and the other minimizes false negatives. A two-stage filtering pipeline (Randoop and OASIs) then reduces the number of invalid MRs reported. In an evaluation on 23 methods from three libraries, GenMorph produces effective MRs for most subjects, improves fault detection when its MRs are added to Randoop and EvoSuite test suites, and outperforms AutoMR on most of the methods where the two are compared. The work shows that metamorphic oracles can be generated automatically, reused across test inputs, and combined with conventional test-case assertions to expose more faults in regression testing.
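The two fitness signals described above can be illustrated with a small Java sketch. It is not GenMorph's implementation: every class, method, and field name below is hypothetical, the method under test (an absolute-value function) and its two hand-written mutants are stand-ins, and counting false positives per input and false negatives per mutant is an assumed granularity.

```java
import java.util.List;
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch: every name here is illustrative, not GenMorph's API.
final class MrFitnessSketch {

    /** Relation expected to hold between the source output and the follow-up output. */
    interface OutputRelation {
        boolean holds(double sourceOutput, double followUpOutput);
    }

    /** A candidate MR: an input transformation paired with an output relation. */
    static final class CandidateMr {
        final DoubleUnaryOperator inputTransformation;
        final OutputRelation outputRelation;

        CandidateMr(DoubleUnaryOperator inputTransformation, OutputRelation outputRelation) {
            this.inputTransformation = inputTransformation;
            this.outputRelation = outputRelation;
        }

        /** True if the relation holds for `impl` on the source input x. */
        boolean holdsOn(DoubleUnaryOperator impl, double x) {
            double sourceOutput = impl.applyAsDouble(x);
            double followUpOutput = impl.applyAsDouble(inputTransformation.applyAsDouble(x));
            return outputRelation.holds(sourceOutput, followUpOutput);
        }
    }

    /** False positives: inputs on which the MR raises an alarm for the correct method. */
    static int falsePositives(CandidateMr mr, DoubleUnaryOperator correct, double[] inputs) {
        int alarms = 0;
        for (double x : inputs) {
            if (!mr.holdsOn(correct, x)) alarms++;
        }
        return alarms;
    }

    /** False negatives: mutants that the MR never exposes on the given inputs. */
    static int falseNegatives(CandidateMr mr, List<DoubleUnaryOperator> mutants, double[] inputs) {
        int missed = 0;
        for (DoubleUnaryOperator mutant : mutants) {
            boolean exposed = false;
            for (double x : inputs) {
                if (!mr.holdsOn(mutant, x)) { exposed = true; break; }
            }
            if (!exposed) missed++;
        }
        return missed;
    }

    // Assumed demo data: the method under test and two hand-written mutants.
    static final DoubleUnaryOperator CORRECT = x -> Math.abs(x);
    static final List<DoubleUnaryOperator> MUTANTS = List.of(x -> x, x -> Math.abs(x) + 1);
    static final double[] INPUTS = {-3.0, 0.0, 2.5};

    // Valid MR: abs(-x) == abs(x).
    static final CandidateMr NEGATION_MR = new CandidateMr(x -> -x, (a, b) -> a == b);
    // Invalid MR: abs(x) < abs(x + 1), which fails for some negative inputs.
    static final CandidateMr SHIFT_MR = new CandidateMr(x -> x + 1, (a, b) -> a < b);

    public static void main(String[] args) {
        System.out.println("negation FP = " + falsePositives(NEGATION_MR, CORRECT, INPUTS));
        System.out.println("negation FN = " + falseNegatives(NEGATION_MR, MUTANTS, INPUTS));
        System.out.println("shift FP    = " + falsePositives(SHIFT_MR, CORRECT, INPUTS));
    }
}
```

In this toy setting the negation MR raises no false alarm but misses the mutant that adds 1 to the result, while the shift MR raises a false alarm on the input -3.0. A search guided by both counts would prefer candidates that drive both toward zero.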

Abstract

Metamorphic testing is a popular approach that aims to alleviate the oracle problem in software testing. At the core of this approach are Metamorphic Relations (MRs), specifying properties that hold among multiple test inputs and corresponding outputs. Deriving MRs is mostly a manual activity, since their automated generation is a challenging and largely unexplored problem. This paper presents GenMorph, a technique to automatically generate MRs for Java methods that involve inputs and outputs that are boolean, numerical, or ordered sequences. GenMorph uses an evolutionary algorithm to search for effective test oracles, i.e., oracles that trigger no false alarms and expose software faults in the method under test. The proposed search algorithm is guided by two fitness functions that measure the number of false alarms and the number of missed faults for the generated MRs. Our results show that GenMorph generates effective MRs for 18 out of 23 methods (mutation score >20%). Furthermore, it can increase Randoop's fault detection capability in 7 out of 23 methods, and EvoSuite's in 14 out of 23 methods. When compared with AutoMR, a state-of-the-art MR generator, GenMorph achieved a higher fault detection capability in 9 out of 10 methods.
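To make the notion of an MR over ordered sequences concrete, here is a minimal hand-written Java illustration. It was not produced by GenMorph: the class, the `max` method under test, and the seeded fault are all hypothetical. The relation says that reversing the input sequence must not change the maximum, so no expected output value is needed to check it.

```java
import java.util.function.ToIntFunction;

// Hypothetical illustration of a metamorphic relation over an ordered sequence.
final class SequenceMrSketch {

    /** Method under test (assumed): maximum of a non-empty array. */
    static int max(int[] values) {
        int best = values[0];
        for (int i = 1; i < values.length; i++) {
            best = Math.max(best, values[i]);
        }
        return best;
    }

    /** Seeded fault (assumed): the loop bound skips the last element. */
    static int faultyMax(int[] values) {
        int best = values[0];
        for (int i = 1; i < values.length - 1; i++) {
            best = Math.max(best, values[i]);
        }
        return best;
    }

    /** Input transformation: the follow-up input is the reversed source input. */
    static int[] reversed(int[] values) {
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[values.length - 1 - i];
        }
        return out;
    }

    /** Output relation: the result on the reversed input equals the result on the source input. */
    static boolean reversalMrHolds(ToIntFunction<int[]> impl, int[] source) {
        return impl.applyAsInt(source) == impl.applyAsInt(reversed(source));
    }

    public static void main(String[] args) {
        int[] source = {1, 5, 9};
        System.out.println("correct max satisfies MR: " + reversalMrHolds(SequenceMrSketch::max, source));
        System.out.println("faulty max satisfies MR:  " + reversalMrHolds(SequenceMrSketch::faultyMax, source));
    }
}
```

The same relation can be reused with any non-empty input array, which is what makes an MR a reusable oracle. Here it exposes the seeded fault on {1, 5, 9} because the maximum sits in the last position of the source input but not of the reversed one.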

Paper Structure

This paper contains 20 sections, 9 equations, 1 figure, 3 tables, 1 algorithm.

Figures (1)

  • Figure 1: Logical architecture of GenMorph for the automated generation of MRs

Theorems & Definitions (5)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5