Comparative Analysis of Quantum and Classical Support Vector Classifiers for Software Bug Prediction: An Exploratory Study

Md Nadim; Mohammad Hassan; Ashis Kumar Mandal; Chanchal K. Roy; Banani Roy; Kevin A. Schneider

TL;DR

The paper investigates Quantum Support Vector Classifiers (QSVC) and Pegasos QSVC (PQSVC) for detecting buggy software commits, comparing them with the classical SVC across 14 open-source projects (30,924 instances). It addresses scalability by splitting the data into 500-instance chunks, training one model per chunk, and aggregating the chunk models' predictions with a tuned threshold; an incremental testing approach mitigates the cost of quantum feature mapping. QSVC and PQSVC can be effective in small-data regimes such as the Short-Term Activity Frame (STAF), but QSVC faces scalability bottlenecks on larger datasets, where aggregation into a Global QSVC yields notable improvements in several projects. PQSVC often underperforms SVC. The work highlights both the promise and the current limits of quantum machine learning for software defect prediction and offers a reproducible pipeline for further research in this domain.

Abstract

Purpose: Quantum computing promises to transform problem-solving across various domains with rapid and practical solutions. Within Software Evolution and Maintenance, Quantum Machine Learning (QML) remains largely underexplored, particularly for challenges such as detecting buggy software commits in code repositories. Methods: In this study, we investigate the practical application of Quantum Support Vector Classifiers (QSVC) for detecting buggy software commits across 14 open-source software projects of diverse dataset sizes, encompassing 30,924 data instances. We compare the QML algorithms QSVC and PQSVC (Pegasos QSVC) against the classical Support Vector Classifier (SVC). Our technique handles large datasets in QSVC algorithms by dividing them into smaller subsets. We propose and evaluate an aggregation method that combines the predictions of the models trained on these subsets to classify the entire test dataset. We also introduce an incremental testing methodology to overcome the difficulties of quantum feature mapping during testing. Results: The study shows the effectiveness of QSVC and PQSVC in detecting buggy software commits. The aggregation technique successfully combines predictions from smaller data subsets, enhancing the overall detection accuracy for the entire test dataset. The incremental testing methodology effectively manages the challenges associated with quantum feature mapping during the testing process. Conclusion: We contribute to the advancement of QML algorithms in defect prediction, unveiling the potential for further research in this domain. The specific scenario of the Short-Term Activity Frame (STAF) highlights the early detection of buggy software commits during the initial development phases of software systems, particularly when dataset sizes remain insufficient to train machine learning models.
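The chunk-and-aggregate idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the aggregation rule (fraction of chunk models voting "buggy") and the helper names `split_into_chunks`, `vote_fraction`, `tune_threshold`, and `predict_global` are assumptions made for illustration, and the chunk models are plain callables standing in for trained QSVC models. Only the 500-instance chunk size and the goal of choosing a threshold that maximizes the F-score come from the text.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a trained chunk model: returns 1 (buggy) or 0 (clean).
Model = Callable[[Sequence[float]], int]


def split_into_chunks(rows: Sequence, chunk_size: int = 500) -> List[Sequence]:
    """Split the training rows into consecutive chunks of at most chunk_size."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def vote_fraction(models: Sequence[Model], x: Sequence[float]) -> float:
    """Fraction of chunk models that label x as buggy (assumed aggregation rule)."""
    return sum(m(x) for m in models) / len(models)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for the positive (buggy) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tune_threshold(models: Sequence[Model],
                   X_val: Sequence[Sequence[float]],
                   y_val: Sequence[int]) -> Tuple[float, float]:
    """Pick the vote-fraction threshold that maximizes F1 on validation data."""
    fractions = [vote_fraction(models, x) for x in X_val]
    best_t, best_f1 = 0.0, -1.0
    for t in sorted(set(fractions)):  # only observed fractions can change predictions
        preds = [1 if f >= t else 0 for f in fractions]
        score = f1_score(y_val, preds)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1


def predict_global(models: Sequence[Model],
                   X: Sequence[Sequence[float]],
                   threshold: float) -> List[int]:
    """Aggregate chunk-model votes into a single prediction per instance."""
    return [1 if vote_fraction(models, x) >= threshold else 0 for x in X]
```

In practice each chunk returned by `split_into_chunks` would be used to train one model, and `tune_threshold` would be run on held-out data before calling `predict_global` on the test set. Sweeping only the observed vote fractions keeps the search exact without an arbitrary grid.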

Paper Structure (13 sections, 1 equation, 9 figures, 2 tables)

Introduction
Background
Methodology
Dataset Preparation
Verifying the Short Term Activity Frame (STAF)
Train QSVC with Large Number of Dataset Instances
Determining the Aggregation Threshold
Tuning Chunk Models for Threshold Optimization
Testing the Model Performance
Result and Discussion
Threats to Validity
Related Work
Conclusion & Future Work

Figures (9)

Figure 1: Short-term Activity Frame (STAF)
Figure 2: Comparison of Precision, Recall, F1 Score of Quantum Random Forest (QRF) with popular classical ML (CML) algorithms. The datasets are in decreasing order as shown in the dataset summary table.
Figure 3: Aggregation and Tuning Strategy of Chunk Models to Perform Global QSVC Prediction
Figure 4: Calculating Aggregation Threshold for Maximizing F-Score
Figure 5: Utilizing Precision-Recall Curve to Determine Optimal Tuning Threshold for Training Aggregated Chunk-QSVC into Global QSVC. Each data point on the curve presents the associated threshold, precision, recall, and F-score values.
...and 4 more figures
