Classical Autoencoder Distillation of Quantum Adversarial Manipulations
Amena Khatun, Muhammad Usman
TL;DR
The paper addresses the vulnerability of quantum classifiers to adversarial perturbations, including quantum adversarial attacks, by proposing a defense based on distilling adversarial noise through a classical autoencoder. It introduces QVC-CED, a hybrid quantum-classical pipeline where a quantum variational classifier (QVC) is complemented by a classical encoder–decoder autoencoder that purifies adversarial inputs before reclassification. The authors demonstrate robust performance on MNIST and FMNIST under FGSM and PGD, achieving substantial accuracy recovery after reconstruction (e.g., MNIST around 80% and FMNIST around 65% at attack strength 0.3). This work provides a practical pathway toward robust QML in both classical and quantum adversarial contexts and motivates broader integration with quantum architectures and hardware-aware defenses.
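As a rough illustration of the purify-then-reclassify idea described above, the following minimal sketch uses a linear autoencoder (equivalent to PCA) as a stand-in for the paper's classical encoder-decoder, and an arbitrary callable as a stand-in for the quantum variational classifier. The function names `fit_linear_autoencoder`, `purify`, and `purify_then_classify` are hypothetical and do not come from the paper; the actual QVC-CED architecture and training procedure are not specified in this summary.

```python
import numpy as np

def fit_linear_autoencoder(x_clean, latent_dim):
    """Fit a linear encoder/decoder (PCA) on clean, flattened images.

    Assumption: this linear model only stands in for the paper's
    classical autoencoder, whose architecture is not given here.
    """
    mean = x_clean.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centred data.
    _, _, vt = np.linalg.svd(x_clean - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def purify(x, mean, components):
    """Encode inputs to the latent space, then decode back to pixel space."""
    latent = (x - mean) @ components.T   # encoder
    return latent @ components + mean    # decoder

def purify_then_classify(x, mean, components, classify):
    """Reconstruct (possibly adversarial) inputs before classification.

    `classify` is any callable mapping a batch of inputs to labels;
    in the paper this role is played by the quantum classifier.
    """
    return classify(purify(x, mean, components))
```

Because the decoder can only produce points in the subspace learned from clean data, perturbation components outside that subspace are discarded by the reconstruction. A trained nonlinear autoencoder follows the same principle with a learned, curved manifold in place of the subspace.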
Abstract
Quantum neural networks have been proven robust against classical adversarial attacks, but their vulnerability to quantum adversarial attacks remains a challenging problem. Here we report a new technique for the distillation of quantum-manipulated image datasets using classical autoencoders. Our technique recovers quantum classifier accuracies when tested on standard machine learning benchmarks utilising the MNIST and FMNIST image datasets under PGD and FGSM adversarial attack settings. Our work highlights a promising pathway to achieve fully robust quantum machine learning in both classical and quantum adversarial scenarios.
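For readers unfamiliar with the two attack settings named above, the sketch below shows FGSM and PGD in their common L-infinity form, applied to a simple logistic-regression model whose input gradient has a closed form. This is an assumption made for illustration: the paper attacks a quantum classifier, and the helper names (`input_gradient`, `fgsm`, `pgd`), the step size, and the pixel range [0, 1] are hypothetical choices rather than details taken from the source.

```python
import numpy as np

def input_gradient(x, y, w, b):
    """Gradient of binary cross-entropy w.r.t. the input of a logistic model.

    x: (n, d) inputs, y: (n,) labels in {0, 1}, w: (d,) weights, b: bias.
    """
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return (p - y)[:, None] * w

def fgsm(x, y, w, b, eps):
    """Single signed-gradient step of size eps, clipped to valid pixels."""
    x_adv = x + eps * np.sign(input_gradient(x, y, w, b))
    return np.clip(x_adv, 0.0, 1.0)

def pgd(x, y, w, b, eps, step, n_steps):
    """Iterated signed-gradient steps, projected back onto the eps-ball."""
    x_adv = x.copy()
    for _ in range(n_steps):
        x_adv = x_adv + step * np.sign(input_gradient(x_adv, y, w, b))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # L-infinity projection
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep valid pixel range
    return x_adv
```

In both functions `eps` plays the role of the attack strength (for example 0.3), bounding how far any single pixel may move from its original value.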
