FlaKat: A Machine Learning-Based Categorization Framework for Flaky Tests

Shizhe Lin; Ryan Zheng He Liu; Ladan Tahvildari

TL;DR

This research work proposes a novel categorization framework, called FlaKat, which uses machine-learning classifiers for fast and accurate prediction of the category of a given flaky test that reflects its root cause.

Abstract

Flaky tests can pass or fail non-deterministically, without alterations to a software system. Such tests are frequently encountered by developers and undermine the credibility of test suites. State-of-the-art research incorporates machine learning solutions into flaky test detection and achieves reasonably good accuracy. Moreover, the majority of automated flaky test repair solutions are designed for specific types of flaky tests. This research work proposes a novel categorization framework, called FlaKat, which uses machine-learning classifiers for fast and accurate prediction of the category of a given flaky test that reflects its root cause. Sampling techniques are applied to address the imbalance between flaky test categories in the International Dataset of Flaky Tests (IDoFT). A new evaluation metric, called Flakiness Detection Capacity (FDC), is proposed for measuring the accuracy of classifiers from the perspective of information theory, and a proof of its effectiveness is provided. The final FDC results also agree with the $F_1$ score regarding which classifier yields the best flakiness classification.
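As a rough illustration of two ideas named in the abstract, class rebalancing by sampling and evaluation by $F_1$ score, the following minimal sketch may help. It is not the authors' implementation: random oversampling is used here only as a stand-in, since the abstract does not say which sampling techniques FlaKat applies, and the category labels `"A"`, `"B"`, `"C"` and the function names `random_oversample` and `macro_f1` are hypothetical. FDC is not sketched because its definition is not given in this summary.

```python
import random
from collections import defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class (assumed stand-in technique)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for x, y in zip(samples, labels):
        by_label[y].append(x)
    target = max(len(xs) for xs in by_label.values())

    out_x, out_y = [], []
    for y, xs in by_label.items():
        # Draw extra copies with replacement from the existing samples of this class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, F1 = 2TP / (2TP + FP + FN)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In a setup like this, oversampling would normally be applied to the training split only, so that duplicated samples do not leak into the data used to compute the score.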

Paper Structure (18 sections, 3 equations, 4 figures, 5 tables, 1 algorithm)

Introduction
FlaKat Framework
Motivation
Workflow and Implementation
Research Objectives
Evaluation
Dataset
Vector Embedding Generations
Dimensionality Reduction
Qualitative Analysis
Quantitative Analysis
Prediction Effectiveness
$F_1$ score
Category-specific Results
Flakiness Detection Capacity
...and 3 more sections

Figures (4)

Figure 1: The original distribution of categories of flaky tests in the dataset.
Figure 2: Visualization of data points in reduced vector space.
Figure 3: Performance of embeddings on classifiers measured in $F_1$ score.
Figure 4: Performance of embeddings on classifiers measured in FDC.
