Critical Evaluation of Quantum Machine Learning for Adversarial Robustness
Saeefa Rubaiyet Nowmi, Jesus Lopez, Md Mahmudul Alam Imon, Shahrooz Pouryousef, Mohammad Saidur Rahman
TL;DR
This work delivers a systematic framework for evaluating adversarial robustness in quantum machine learning (QML) and provides an empirical study across black-box, gray-box, and white-box threat models using quantum neural networks. It analyzes how data encoding schemes (angle vs. amplitude) and circuit depth interact with noise in NISQ devices to shape resilience, revealing that amplitude encoding excels in noiseless, deep regimes while angle encoding offers greater robustness in shallow, noisy regimes. The study demonstrates that QMLP models are surprisingly robust to classical label-flipping but are vulnerable to gradient-based evasion and encoder-level attacks like QUID, with noise offering a partial natural defense. The findings underscore the need for quantum-native defenses, hardware-aware circuit designs, and threat-aware pipelines to enable secure, real-world deployment of QML systems on NISQ-era hardware.
Abstract
Quantum Machine Learning (QML) integrates quantum computational principles into learning algorithms, offering improved representational capacity and computational efficiency. Nevertheless, the security and robustness of QML systems remain underexplored, especially under adversarial conditions. In this paper, we present a systematization of adversarial robustness in QML, integrating conceptual organization with empirical evaluation across three threat models: black-box, gray-box, and white-box. We implement representative attacks in each category, including label-flipping for black-box, QUID encoder-level data poisoning for gray-box, and FGSM and PGD for white-box, using Quantum Neural Networks (QNNs) trained on two datasets from distinct domains: MNIST from computer vision and AZ-Class from Android malware, across multiple circuit depths (2, 5, 10, and 50 layers) and two encoding schemes (angle and amplitude). Our evaluation shows that amplitude encoding yields the highest clean accuracy (93% on MNIST and 67% on AZ-Class) in deep, noiseless circuits; however, it degrades sharply under adversarial perturbations and depolarization noise (p=0.01), with accuracy dropping below 5%. In contrast, angle encoding, while offering lower representational capacity, remains more stable in shallow, noisy regimes, revealing a trade-off between capacity and robustness. Moreover, the QUID attack attains higher attack success rates, though quantum noise channels disrupt the Hilbert-space correlations it exploits, weakening its impact in image domains. This suggests that noise can act as a natural defense mechanism in Noisy Intermediate-Scale Quantum (NISQ) systems. Overall, our findings guide the development of secure and resilient QML architectures for practical deployment. These insights underscore the importance of designing threat-aware models that remain reliable under real-world noise in NISQ settings.
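To make the two encoding schemes and the depolarization noise mentioned above concrete, the following is a minimal NumPy sketch, not the authors' implementation. The function names (`angle_encode`, `amplitude_encode`, `depolarize`) are hypothetical, the angle encoding assumes an RY rotation per feature, and the noise model assumes a global depolarizing channel of the form rho -> (1 - p) rho + p I/d. The paper may apply noise per gate or per qubit, and libraries differ in how they parameterize p.

```python
import numpy as np


def angle_encode(x):
    """Angle encoding (assumed RY form): one qubit per feature.

    Each feature x_i prepares RY(x_i)|0> = [cos(x_i/2), sin(x_i/2)],
    and the full state is the tensor product over all qubits.
    """
    state = np.array([1.0])
    for xi in x:
        qubit = np.array([np.cos(xi / 2.0), np.sin(xi / 2.0)])
        state = np.kron(state, qubit)
    return state


def amplitude_encode(x):
    """Amplitude encoding: features become the amplitudes of the state.

    The vector is zero-padded to the next power of two and L2-normalized,
    so n features need only about log2(n) qubits.
    """
    x = np.asarray(x, dtype=float)
    n_qubits = max(1, (len(x) - 1).bit_length())
    padded = np.zeros(2 ** n_qubits)
    padded[: len(x)] = x
    norm = np.linalg.norm(padded)
    if norm == 0.0:
        raise ValueError("cannot amplitude-encode the zero vector")
    return padded / norm


def depolarize(rho, p):
    """Global depolarizing channel (assumed form): mix rho with I/d."""
    d = rho.shape[0]
    return (1.0 - p) * rho + p * np.eye(d) / d


if __name__ == "__main__":
    features = [0.1, 0.5, 0.9, 1.3]
    psi_angle = angle_encode(features)    # 4 qubits, 16 amplitudes
    psi_amp = amplitude_encode(features)  # 2 qubits, 4 amplitudes
    rho = np.outer(psi_amp, psi_amp)      # pure-state density matrix
    noisy = depolarize(rho, p=0.01)
    print(len(psi_angle), len(psi_amp))
    print("purity before:", np.trace(rho @ rho))
    print("purity after: ", np.trace(noisy @ noisy))
```

For four features, this angle encoding uses four qubits while amplitude encoding uses two, which illustrates the difference in qubit cost between the two schemes. The purity drop after `depolarize` shows how the channel mixes the encoded state toward the maximally mixed state.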
