Defect Prediction Using Stylistic Metrics

Rafed Muhammad Yasir; Ahmedul Kabir

TL;DR

This work addresses defect prediction by incorporating programming stylistic metrics, a novel signal beyond traditional code and process metrics. It analyzes 60 stylistic features across 14 releases of 5 open-source C++ projects, using four classifiers (Naive Bayes, Support Vector Machine, Decision Tree, and Logistic Regression) with SMOTE balancing and VIF-based feature pruning; buggy files are labeled via bug-fix commits and the SZZ algorithm. Within-project results favor Decision Tree, with a mean F1 of 78.33%, while cross-project results are best with Decision Tree and SVM, at mean F1 scores of 72.07% and 72.57%, respectively. 6 of 9 within-project and 9 of 14 cross-project cases meet the predefined acceptance thresholds (Recall > 70%, Precision > 50%). The findings suggest that stylistic metrics provide meaningful, complementary signals for defect proneness at the file level. The work also offers a publicly available dataset for future exploration, with planned expansion to more cross-project configurations and integration with traditional defect predictors.

Abstract

Defect prediction is one of the most popular research topics due to its potential to minimize software quality assurance efforts. Existing approaches have examined defect prediction from various perspectives, such as complexity and developer metrics. However, none of these consider programming style for defect prediction. This paper aims to analyze the impact of stylistic metrics on both within-project and cross-project defect prediction. For prediction, four widely used machine learning algorithms are used: Naive Bayes, Support Vector Machine, Decision Tree, and Logistic Regression. The experiment is conducted on 14 releases of 5 popular open-source projects. F1, Precision, and Recall are inspected to evaluate the results. Results reveal that stylistic metrics are a good predictor of defects.
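The evaluation measures named above (Precision, Recall, F1) and the acceptance thresholds quoted in the TL;DR (Recall > 70%, Precision > 50%) can be illustrated with a minimal sketch. The function names `precision_recall_f1` and `is_acceptable`, the label encoding (1 = buggy file, 0 = clean file), and the toy labels are assumptions made for illustration; they are not taken from the paper or its dataset.

```python
def precision_recall_f1(actual, predicted):
    """Return (precision, recall, F1) for the buggy class (label 1).

    `actual` and `predicted` are equal-length sequences of 0/1 labels,
    one entry per file. The encoding is an assumption for this sketch.
    """
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


def is_acceptable(precision, recall, min_recall=0.70, min_precision=0.50):
    """Apply the acceptance thresholds quoted in the TL;DR (strict inequalities)."""
    return recall > min_recall and precision > min_precision


# Toy example (made-up labels): 4 buggy files, 6 clean files.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

p, r, f1 = precision_recall_f1(actual, predicted)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f} "
      f"acceptable={is_acceptable(p, r)}")
# prints: precision=0.75 recall=0.75 F1=0.75 acceptable=True
```

In this toy case three of the four buggy files are found and one clean file is flagged wrongly, so both precision and recall are 0.75 and the case would count as acceptable under the stated thresholds.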

Paper Structure (11 sections, 3 equations, 1 figure, 5 tables)

Introduction
Background and Related Work
Methodology
Dataset
Approach
Experiment
Performance Evaluation
Result Analysis
Threats to Validity
Conclusion
Acknowledgement

Figures (1)

Figure 1: Training and Test Data Selection for Within-project and Cross-project Defect Prediction
