Embedding-Aware Quantum-Classical SVMs for Scalable Quantum Machine Learning

Sebastián Andrés Cajas Ordóñez, Luis Fernando Torres Torres, Mario Bifulco, Carlos Andrés Durán, Cristian Bosch, Ricardo Simón Carbajo

TL;DR

The paper addresses the scalability of quantum SVMs on high-dimensional data with an embedding-aware hybrid QSVM pipeline that combines class-balanced $k$-means distillation with pretrained embeddings. Using a 16-qubit tensor-network quantum kernel simulated via cuTensorNet, it reports that Vision Transformer embeddings yield a quantum advantage, with accuracy gains over classical SVMs of up to $8.02\%$ on Fashion-MNIST and $4.42\%$ on MNIST, whereas CNN features degrade performance. The key insight is that quantum kernel advantage depends on the choice of embedding: transformer attention-based representations pair well with quantum feature spaces. This offers a practical route to scalable quantum machine learning on high-dimensional tasks that builds on modern neural architectures while keeping computational cost under control.

Abstract

Quantum Support Vector Machines face scalability challenges due to high-dimensional quantum states and hardware limitations. We propose an embedding-aware quantum-classical pipeline combining class-balanced k-means distillation with pretrained Vision Transformer embeddings. Our key finding: ViT embeddings uniquely enable quantum advantage, achieving up to 8.02% accuracy improvements over classical SVMs on Fashion-MNIST and 4.42% on MNIST, while CNN features show performance degradation. Using 16-qubit tensor network simulation via cuTensorNet, we provide the first systematic evidence that quantum kernel advantage depends critically on embedding choice, revealing fundamental synergy between transformer attention and quantum feature spaces. This provides a practical pathway for scalable quantum machine learning that leverages modern neural architectures.
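The class-balanced k-means distillation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the exact procedure, so everything here is an assumption made for illustration: the function names (`kmeans`, `distill`), the `per_class` parameter, and the choice to keep the real sample nearest to each centroid. It uses only the Python standard library and is not the authors' implementation.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids (hypothetical helper)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def distill(features, labels, per_class, seed=0):
    """Return indices of per_class representative samples for every class.

    Assumption: clustering is run separately inside each class, and the
    real sample closest to each centroid is kept, so every class
    contributes the same number of points.
    """
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    selected = []
    for y in sorted(by_class):
        idx = by_class[y]
        pts = [features[i] for i in idx]
        k = min(per_class, len(pts))
        chosen = set()
        for c in kmeans(pts, k, seed=seed):
            # Skip samples already chosen so the representatives stay distinct.
            nearest = min((i for i in idx if i not in chosen),
                          key=lambda i: sq_dist(features[i], c))
            chosen.add(nearest)
        selected.extend(sorted(chosen))
    return selected
```

Calling `distill(features, labels, per_class=2)` on a labelled feature list returns two sample indices per class, which would then be passed on to embedding extraction and PCA in a pipeline of this kind.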

Paper Structure

This paper contains 23 sections, 1 equation, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Overview of the pipeline, from data extraction to QSVM evaluation. The process begins with image data extraction, followed by class-balanced $k$-means clustering to distill representative samples. Vector embeddings are then extracted using ImageNet-pretrained models such as EfficientNet or ViT. PCA then compresses the embeddings to reduce dimensionality and match quantum hardware constraints. The compressed embeddings are the inputs to a Quantum Support Vector Machine (QSVM) built with the TNSM framework, which constructs a quantum kernel via a data re-uploading and compute–uncompute strategy. The model is trained and validated through cross-validation, then evaluated on a held-out test set.
  • Figure 2: Qiskit quantum circuit used in the QSVM pipeline. Each of the four qubits is initialized with a Hadamard gate, followed by data-encoding rotations using parameterized $R_Z$ and $R_Y$ gates. A sequence of CNOT gates creates entanglement between adjacent qubits, after which a second layer of $R_Z$ gates is applied. This structure forms an embedding-aware quantum feature map for encoding classical input features.
  • Figure 3: Violin plots show the distribution of test accuracy across K-fold cross-validation for MNIST. The width of each violin indicates the density of results; wider sections reflect more frequent accuracy values, helping visualize consistency and variability in model performance.
  • Figure 4: Violin plots show the distribution of test accuracy across K-fold cross-validation for Fashion-MNIST. The width of each violin indicates the density of results; wider sections reflect more frequent accuracy values, helping visualize consistency and variability in model performance.
  • Figure 5: Comparison of total execution time and test accuracy for different QSVM models for MNIST. The x-axis represents the average test accuracy across K-folds, while the y-axis (log scale) shows the total runtime in seconds. Each point corresponds to a model variant, with horizontal and vertical lines indicating the standard deviation of accuracy and time, respectively.
  • ...and 3 more figures
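
The feature map described for Figure 2 and the compute–uncompute kernel named in Figure 1 can be sketched as follows. This is a hedged illustration, not the paper's code: it replaces the cuTensorNet tensor-network simulation with a small dense statevector simulator in pure Python, and it assumes that feature $x_q$ sets the angle of every rotation on qubit $q$ (the captions do not state the angle assignment). The function names are hypothetical. The compute–uncompute kernel $|\langle 0|U^\dagger(y)U(x)|0\rangle|^2$ is evaluated here as the equivalent state overlap $|\langle\phi(y)|\phi(x)\rangle|^2$.

```python
import cmath
import math

def apply_1q(state, gate, q):
    """Apply a 2x2 gate to qubit q of a statevector (qubit 0 = lowest bit)."""
    new = state[:]
    bit = 1 << q
    for i in range(len(state)):
        if not i & bit:
            a, b = state[i], state[i | bit]
            new[i] = gate[0][0] * a + gate[0][1] * b
            new[i | bit] = gate[1][0] * a + gate[1][1] * b
    return new

def apply_cnot(state, control, target):
    """Swap the target-bit amplitudes wherever the control bit is 1."""
    new = state[:]
    c, t = 1 << control, 1 << target
    for i in range(len(state)):
        if i & c and not i & t:
            new[i], new[i | t] = state[i | t], state[i]
    return new

def rz(theta):
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]

def ry(theta):
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]

def feature_state(x):
    """Prepare |phi(x)>: H, RZ, RY per qubit, CNOT chain, second RZ layer."""
    n = len(x)
    state = [0j] * (2 ** n)
    state[0] = 1 + 0j
    for q in range(n):
        state = apply_1q(state, H, q)
        state = apply_1q(state, rz(x[q]), q)  # assumed angle: x[q]
        state = apply_1q(state, ry(x[q]), q)  # assumed angle: x[q]
    for q in range(n - 1):
        state = apply_cnot(state, q, q + 1)   # entangle adjacent qubits
    for q in range(n):
        state = apply_1q(state, rz(x[q]), q)  # second RZ layer
    return state

def quantum_kernel(x, y):
    """Fidelity kernel |<phi(y)|phi(x)>|^2 between two feature vectors."""
    phi_x, phi_y = feature_state(x), feature_state(y)
    overlap = sum(b.conjugate() * a for a, b in zip(phi_x, phi_y))
    return abs(overlap) ** 2
```

A Gram matrix built from `quantum_kernel` over the distilled, PCA-compressed embeddings could be handed to any SVM solver that accepts a precomputed kernel. Dense simulation costs memory exponential in the number of qubits, which is presumably why the paper turns to tensor-network simulation at 16 qubits.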