Latent Mutants: A large-scale study on the Interplay between mutation testing and software evolution

Jeongju Sohn; Ezekiel Soremekun; Michail Papadakis

TL;DR

This study introduces latent mutants, the mutants that are live in one version of a program but are killed in later revisions, and situates them within software evolution. By analyzing 131,308 mutants across 13 Defects4J Java projects, the authors show that latent mutants exist in most projects, take about $104$ days on average to be killed, and can be predicted from historical code-change features and mutation operators using a Random Forest model with $\text{acc}=0.87$ and $\text{bal-acc}=0.67$, plus a mean average precision (MAP) above baseline for ranking. The work reveals that latent mutants often arise from changes in dependent code rather than in the mutated lines themselves, and that certain mutation operators are more prone to yield latent mutants, connecting mutation testing with long-term code maintenance. Overall, latent mutants offer a practical way to focus mutation testing and test development on the mutations most likely to reveal future faults, enabling durable improvements to test suites in evolving software.

Abstract

In this paper, we apply mutation testing in an in-time fashion, i.e., across multiple project releases. Thus, we investigate how the mutants of the current version behave in the future versions of the programs. We study the characteristics of what we call latent mutants, i.e., the mutants that are live in one version and killed in later revisions, and explore whether they are predictable with these properties. We examine 131,308 mutants generated by Pitest on 13 open-source projects. Around 11.2% of these mutants are live, and 3.5% of them are latent, manifesting in 104 days on average. Using the mutation operators and change-related features, we successfully demonstrate that these latent mutants are identifiable, predicting them with an accuracy of 86% and a balanced accuracy of 67% using a simple random forest classifier.
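The gap between the reported accuracy and balanced accuracy is what one would expect when latent mutants are a small minority of the live mutants. The following is a minimal sketch of the two metrics, assuming the standard binary definition of balanced accuracy (the mean of per-class recall); the label counts are made-up toy numbers chosen for illustration, not data from the study, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = latent, 0 = non-latent."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of all predictions that are correct."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on the positive class and the recall on the negative class."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    recall_latent = tp / (tp + fn)
    recall_non_latent = tn / (tn + fp)
    return (recall_latent + recall_non_latent) / 2


# Toy, imbalanced data (assumption): 10 latent and 90 non-latent mutants.
y_true = [1] * 10 + [0] * 90
# Hypothetical predictor: finds 4 of the 10 latent mutants, mislabels 5 non-latent ones.
y_pred = [1] * 4 + [0] * 6 + [1] * 5 + [0] * 85

acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.2f}, balanced accuracy={bal_acc:.2f}")
```

In this toy setting the majority class dominates plain accuracy, while balanced accuracy weights the rare latent class equally, so it is the more informative of the two numbers for judging how well latent mutants themselves are identified.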

Paper Structure (34 sections, 1 equation, 4 figures, 10 tables)

Introduction
Background
Mutation Testing
Software Evolution
Mutation Testing & Software Evolution
Overview
Key Insight
Motivating Example
Terminology and Methodology
Definition of Terms
Code Evolution and Mutant Propagation
Experimental Settings
Research Questions
Subjects
Mutation Testing
...and 19 more sections

Figures (4)

Figure 1: Live mutants at $t_1$, $M_2$, $M_3$ and $M_4$, are propagated to the next versions. Red and blue denote the deletion and the refactoring of code; green refers to semantic changes to the file. $M_2$ is revealed by a new test case at $t_3$, $M_4$ is deleted at $t_2$ and $M_3$ remains undetected.
Figure 2: Prevalence of latent mutants, non-latent mutants and discarded mutants for each mutation operator (aka mutator). CB, VMC and MATH are among the most useful mutators for initial live mutants, whereas NC, NRET, INCR, ERET, and BRET are those useful to generate killed mutants.
Figure 3: (White marks are the mean, and the red lines are the median.)
Figure 4: Trends in change features showing that the trend for latent mutants is strong in recently observed and relatively old but frequently changed program statements for most projects, except Lang. (White marks are the mean, and the red lines are the median.)
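
The three outcomes illustrated in Figure 1 (a mutant revealed by a later test, a mutant whose code is deleted, and a mutant that stays undetected) can be sketched as a small classification over a mutant's status in later versions. This is only an illustrative sketch: the status strings, the `Fate` enum, and `classify_fate` are hypothetical names, and mapping a deleted mutant to the "discarded" category of Figure 2 is an assumption. It also ignores how mutants are actually tracked through refactorings.

```python
from enum import Enum


class Fate(Enum):
    LATENT = "latent"          # live at first, killed in a later version
    NON_LATENT = "non-latent"  # still live in every later version observed
    DISCARDED = "discarded"    # mutated code no longer exists (assumed mapping)


def classify_fate(later_statuses):
    """Classify a mutant that is live at t1 from its statuses at t2, t3, ...

    Each status is one of "live", "killed", or "deleted" (hypothetical vocabulary).
    The first non-live status decides the outcome.
    """
    for status in later_statuses:
        if status == "killed":
            return Fate.LATENT
        if status == "deleted":
            return Fate.DISCARDED
        if status != "live":
            raise ValueError(f"unknown status: {status!r}")
    return Fate.NON_LATENT


# Statuses at t2 and t3 for the live mutants of Figure 1.
figure1_histories = {
    "M2": ["live", "killed"],  # revealed by a new test case at t3
    "M3": ["live", "live"],    # remains undetected
    "M4": ["deleted"],         # deleted at t2
}

fates = {name: classify_fate(history) for name, history in figure1_histories.items()}
for name, fate in fates.items():
    print(name, fate.value)
```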
