On the Generalization of Adversarially Trained Quantum Classifiers
Petros Georgiou, Aaron Mark Thomas, Sharu Theresa Jose, Osvaldo Simeone
TL;DR
This work develops a learning-theoretic framework for adversarially trained quantum classifiers, deriving PAC-style bounds via adversarial Rademacher complexity (ARC). It shows that the excess generalization cost due to adversarial training scales as $\mathcal{O}(\mathcal{S}_{r,p,\epsilon}^{Q/C}/\sqrt{m})$, with the scale and sign determined by the embedding type (rotation vs. amplitude) and the attack type (classical vs. quantum). With rotation embeddings under classical attacks, the extra sample complexity of adversarial training vanishes as the input dimension grows, while amplitude embeddings incur dimension-dependent penalties. Under quantum attacks, the bounds depend on the embedding only through the Hilbert-space dimension $d_H$ (with tighter results under noise constraints). The paper extends the analysis to multi-class settings, provides numerical validation, and outlines future directions including device noise, stability, and non-uniform convergence analyses.
Abstract
Quantum classifiers are vulnerable to adversarial attacks that manipulate their classical or quantum input data. A promising countermeasure is adversarial training, in which quantum classifiers are trained using an attack-aware, adversarial loss function. This work establishes novel bounds on the generalization error of adversarially trained quantum classifiers when tested in the presence of perturbation-constrained adversaries. The bounds show that the excess generalization error incurred to ensure robustness to adversarial attacks scales with the training sample size $m$ as $1/\sqrt{m}$, and they yield insights into the impact of the quantum embedding. For quantum binary classifiers employing \textit{rotation embedding}, we find that, in the presence of adversarial attacks on classical inputs $\mathbf{x}$, the increase in sample complexity due to adversarial training over conventional training vanishes in the limit of high-dimensional inputs $\mathbf{x}$. In contrast, when the adversary can directly attack the quantum state $\rho(\mathbf{x})$ encoding the input $\mathbf{x}$, the excess generalization error depends on the choice of embedding only through its Hilbert space dimension. The results are also extended to multi-class classifiers. We validate our theoretical findings with numerical experiments.
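As a reading aid, the adversarial loss and the generic shape of a Rademacher-type bound can be sketched as follows. The notation is assumed here for illustration and is not taken from the paper: $h_\theta$ denotes the classifier, $\ell \in [0,1]$ a bounded loss, $\epsilon$ the perturbation budget in the $\ell_p$ norm, $\tilde{\mathcal{L}}$ the class of adversarial losses, $\mathcal{R}_m$ its Rademacher complexity, and $\delta$ the confidence level. For an attack on the classical input, the adversarial loss takes the usual worst-case form

$$\tilde{\ell}(h_\theta; \mathbf{x}, y) = \max_{\mathbf{x}' :\, \|\mathbf{x}' - \mathbf{x}\|_p \le \epsilon} \ell\big(h_\theta(\mathbf{x}'), y\big),$$

and the standard Rademacher argument gives, with probability at least $1-\delta$ over $m$ i.i.d. training samples and simultaneously for all $\theta$,

$$\mathbb{E}\big[\tilde{\ell}(h_\theta; \mathbf{x}, y)\big] \le \frac{1}{m}\sum_{i=1}^{m} \tilde{\ell}(h_\theta; \mathbf{x}_i, y_i) + 2\,\mathcal{R}_m(\tilde{\mathcal{L}}) + \sqrt{\frac{\log(1/\delta)}{2m}}.$$

In this generic form, the effect of adversarial training enters through the complexity term $\mathcal{R}_m(\tilde{\mathcal{L}})$, which is the quantity whose excess over conventional training the abstract describes as scaling like $1/\sqrt{m}$. For an attack on the quantum state, the maximization would presumably range over states close to $\rho(\mathbf{x})$ under the paper's perturbation constraint rather than over inputs $\mathbf{x}'$.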
