
QSMOTE-PGM/kPGM: QSMOTE Based PGM and kPGM for Imbalanced Dataset Classification

Bikash K. Behera, Giuseppe Sergioli, Robert Giuntini

TL;DR

This work investigates quantum-inspired classifiers for imbalanced data by pairing QSMOTE-based oversampling with the Pretty Good Measurement (PGM) and kernelized PGM (kPGM). It introduces three QSMOTE variants (KNN-based, fidelity-based, and margin-based) and benchmarks the resulting classifiers against a Random Forest baseline using amplitude and stereo encodings and multiple quantum copies. The quantum-inspired methods show consistent improvements in recall and F1-score, with complementary strengths: PGM's recall gains depend on the encoding, while kPGM is more stable across sampling strategies. The findings offer design guidance on the choice of encoding, number of copies, and sampling strategy for imbalanced problems.

Abstract

Quantum-inspired machine learning (QiML) leverages mathematical frameworks from quantum theory to enhance classical algorithms, with particular emphasis on inner product structures in high-dimensional feature spaces. Among the prominent approaches, the kernel trick, widely used in support vector machines, provides efficient similarity computation, while the Pretty Good Measurement (PGM), originating from quantum state discrimination, enables classification grounded in Hilbert space geometry. Building on recent developments in kernelized PGM (kPGM) and direct PGM-based classifiers, this work presents a unified theoretical and empirical comparison of these paradigms. We analyze their performance across synthetic oversampling scenarios using Quantum SMOTE (QSMOTE) variants. Experimental results show that both PGM and kPGM classifiers consistently outperform a classical random forest baseline, particularly when multiple quantum copies are employed. Notably, PGM with stereo encoding and n_copies=2 achieves the highest overall accuracy (0.8512) and F1-score (0.8234), while kPGM demonstrates competitive and more stable behavior across QSMOTE variants, with top scores of 0.8511 (stereo) and 0.8483 (amplitude). These findings highlight that quantum-inspired classifiers not only provide tangible gains in recall and balanced performance but also offer complementary strengths: PGM benefits from encoding-specific enhancements, whereas kPGM ensures robustness across sampling strategies. Our results advance the understanding of kernel-based and measurement-based QiML methods, offering practical guidance on their applicability under varying data characteristics and computational constraints.
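To make the PGM idea in the abstract concrete, the following is a minimal sketch of a PGM-style classifier in NumPy. It follows the standard textbook construction of the Pretty Good Measurement from class-averaged density operators and is not taken from the paper's implementation. The function names (`amplitude_encode`, `n_copy_state`, `fit_pgm`, `predict_pgm`), the use of plain unit-norm scaling as the amplitude encoding, the empirical class priors, and the toy data are all assumptions made for illustration. Stereo encoding and the kernelized variant are not shown.

```python
import numpy as np


def amplitude_encode(x):
    """Scale a real feature vector to unit norm (assumed form of amplitude encoding)."""
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x)
    if norm == 0:
        raise ValueError("cannot amplitude-encode the zero vector")
    return x / norm


def n_copy_state(psi, n_copies):
    """Return the n-fold tensor power of psi as a flat vector."""
    out = psi
    for _ in range(n_copies - 1):
        out = np.kron(out, psi)
    return out


def fit_pgm(X, y, n_copies=1):
    """Build one PGM effect per class from the encoded training samples."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    weighted = []
    for c in classes:
        states = [n_copy_state(amplitude_encode(x), n_copies) for x in X[y == c]]
        rho_c = sum(np.outer(s, s) for s in states) / len(states)  # class density operator
        prior = len(states) / len(y)  # empirical class prior (assumption)
        weighted.append(prior * rho_c)
    rho = sum(weighted)
    # Pseudo-inverse square root of rho, restricted to its support.
    vals, vecs = np.linalg.eigh(rho)
    keep = vals > 1e-10
    inv_sqrt = (vecs[:, keep] / np.sqrt(vals[keep])) @ vecs[:, keep].T
    effects = [inv_sqrt @ w @ inv_sqrt for w in weighted]
    return classes, effects


def predict_pgm(classes, effects, X, n_copies=1):
    """Assign each sample to the class whose effect gives the largest probability."""
    preds = []
    for x in np.asarray(X, dtype=float):
        s = n_copy_state(amplitude_encode(x), n_copies)
        scores = [s @ E @ s for E in effects]  # Tr(E |s><s|) for a pure state
        preds.append(classes[int(np.argmax(scores))])
    return np.array(preds)


# Toy, made-up data: class 0 clusters near the first axis, class 1 near the second.
X_train = np.array([[1.0, 0.1], [1.0, 0.2], [0.9, 0.0], [0.1, 1.0], [0.2, 1.0]])
y_train = np.array([0, 0, 0, 1, 1])
X_test = np.array([[1.0, 0.05], [0.05, 1.0]])

classes, effects = fit_pgm(X_train, y_train, n_copies=2)
print(predict_pgm(classes, effects, X_test, n_copies=2))
```

Note that the state dimension grows as `d ** n_copies` in this direct construction, which is one reason a kernelized formulation can be attractive when several copies are used.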

Paper Structure

This paper contains 19 sections, 25 equations, 10 figures, 6 tables, 2 algorithms.

Figures (10)

  • Figure 1: Schematic illustration of the three proposed QSMOTE variants. (a) KNN-based QSMOTE interpolates between a sample and its nearest neighbor. (b) Fidelity-based QSMOTE generates samples directed toward the cluster centroid using fidelity weighting. (c) Margin-based QSMOTE filters synthetic points near the decision boundary to retain only confident samples.
  • Figure 2: RF baseline performance across QSMOTE variants.
  • Figure 3: F1 (mean $\pm$ std) by QSMOTE variant for PGM with amplitude encoding.
  • Figure 4: F1 (mean $\pm$ std) by QSMOTE variant for PGM with stereo encoding.
  • Figure 5: Effect of $n\_copies$ on PGM performance across QSMOTE variants.
  • ...and 5 more figures
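
The caption of Figure 1(a) describes KNN-based QSMOTE as interpolating between a sample and its nearest neighbor. The sketch below shows only the classical SMOTE-style version of that step on raw feature vectors. It is an illustration under stated assumptions, not the paper's QSMOTE algorithm: the function name `knn_interpolate`, the Euclidean distance, the single nearest neighbor, and the uniform interpolation weight are all hypothetical choices.

```python
import numpy as np


def knn_interpolate(X_min, n_new, seed=0):
    """Create n_new synthetic minority samples by interpolating toward the nearest neighbor."""
    X_min = np.asarray(X_min, dtype=float)
    if len(X_min) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances; the diagonal is masked so a point is not its own neighbor.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)
    # Pick random base samples and a random position on the segment to each one's neighbor.
    idx = rng.integers(0, len(X_min), size=n_new)
    lam = rng.random(n_new)[:, None]
    return X_min[idx] + lam * (X_min[nearest[idx]] - X_min[idx])
```

Each synthetic point lies on the line segment between a minority sample and its nearest minority neighbor, so the oversampled set stays inside the region already covered by the minority class.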