Adversarial Quantum Machine Learning: An Information-Theoretic Generalization Analysis

Petros Georgiou, Sharu Theresa Jose, Osvaldo Simeone

TL;DR

This work develops information-theoretic generalization bounds for adversarially trained quantum classifiers under $p$-Schatten perturbations with budget $\epsilon$. Building on a prior non-adversarial bound based on the 2-Rényi mutual information $I_2(X:Q)$, the authors derive two separate bounds for $p=1$ and $p=\infty$, each consisting of the original bound plus a term linear in $\epsilon$ that scales with the Hilbert-space dimension, both decaying as $1/\sqrt{T}$. They further analyze training-test mismatch, showing that stronger adversarial training can mitigate mismatches, quantified by an additive term $\xi=d^{(1-1/p')}\epsilon'+d^{(1-1/p)}\epsilon$. The paper also validates the theory with synthetic experiments and explores a noise-aware adversarial training setting, including the beneficial effect of depolarizing noise on generalization. The results offer principled guidance for designing robust quantum classifiers under adversarial perturbations in practical, noisy quantum environments.
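As a rough illustration of how the mismatch term behaves, the sketch below evaluates $\xi=d^{(1-1/p')}\epsilon'+d^{(1-1/p)}\epsilon$ for given adversary parameters. The function name `mismatch_term` and the numerical values are hypothetical and are not taken from the paper; the code only evaluates the formula as stated above.

```python
import math


def mismatch_term(d, p, eps, p_prime, eps_prime):
    """Evaluate xi = d^(1 - 1/p') * eps' + d^(1 - 1/p) * eps.

    d is the Hilbert-space dimension; (p, eps) and (p_prime, eps_prime)
    are the Schatten order and budget of the two adversaries.
    Passing math.inf for an order gives 1/p = 0, i.e. a factor of d.
    """
    return d ** (1.0 - 1.0 / p_prime) * eps_prime + d ** (1.0 - 1.0 / p) * eps


# Hypothetical values, chosen only for illustration.
xi_small = mismatch_term(d=4, p=1, eps=0.1, p_prime=math.inf, eps_prime=0.05)
xi_large = mismatch_term(d=16, p=1, eps=0.1, p_prime=math.inf, eps_prime=0.05)
print(xi_small, xi_large)
```

For $p=1$ the dimension factor is $d^0=1$, while for $p=\infty$ it is $d$, so in this sketch only the $p'=\infty$ contribution grows with the Hilbert-space dimension.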

Abstract

In a manner analogous to their classical counterparts, quantum classifiers are vulnerable to adversarial attacks that perturb their inputs. A promising countermeasure is to train the quantum classifier by adopting an attack-aware, or adversarial, loss function. This paper studies the generalization properties of quantum classifiers that are adversarially trained against bounded-norm white-box attacks. Specifically, a quantum adversary maximizes the classifier's loss by transforming an input state $\rho(x)$ into a state $\lambda$ that is $\epsilon$-close to the original state $\rho(x)$ in $p$-Schatten distance. Under suitable assumptions on the quantum embedding $\rho(x)$, we derive novel information-theoretic upper bounds on the generalization error of adversarially trained quantum classifiers for $p = 1$ and $p = \infty$. The derived upper bounds consist of two terms: the first is an exponential function of the 2-Rényi mutual information between classical data and quantum embedding, while the second term scales linearly with the adversarial perturbation size $\epsilon$. Both terms are shown to decrease as $1/\sqrt{T}$ over the training set size $T$. An extension is also considered in which the adversary assumed during training has different parameters $p$ and $\epsilon$ as compared to the adversary affecting the test inputs. Finally, we validate our theoretical findings with numerical experiments for a synthetic setting.
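The constraint that the perturbed state stays $\epsilon$-close to $\rho(x)$ in $p$-Schatten distance can be checked numerically. The following is a minimal sketch, assuming the distance is the Schatten $p$-norm of the difference of the two density matrices with no extra normalization factor; the helper names `schatten_distance` and `is_admissible` and the example states are hypothetical and are not taken from the paper.

```python
import numpy as np


def schatten_distance(rho, lam, p):
    """Schatten p-norm of (rho - lam), i.e. the l_p norm of its singular values."""
    s = np.linalg.svd(np.asarray(rho) - np.asarray(lam), compute_uv=False)
    if np.isinf(p):
        return float(s.max())
    return float((s ** p).sum() ** (1.0 / p))


def is_admissible(rho, lam, p, eps):
    """True if lam lies within the adversary's budget eps around rho."""
    return schatten_distance(rho, lam, p) <= eps


# Hypothetical single-qubit example: a pure state and a slightly mixed one.
rho = np.array([[1.0, 0.0], [0.0, 0.0]])
lam = np.array([[0.95, 0.0], [0.0, 0.05]])

dist_1 = schatten_distance(rho, lam, 1)         # sum of singular values
dist_inf = schatten_distance(rho, lam, np.inf)  # largest singular value
print(dist_1, dist_inf)
```

In this example the same perturbation is roughly twice as large under $p=1$ as under $p=\infty$, so a fixed budget $\epsilon$ admits a different set of perturbed states depending on $p$.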

Paper Structure (20 sections, 7 theorems, 74 equations, 2 figures)

Key Result

Theorem 1

For any POVM $\Pi \in \mathcal{M}$, the following upper bound on the generalization error $\mathcal{G}(\Pi,\mathcal{T})$ holds with probability at least $1-\delta$, for $\delta \in (0,1)$, with respect to random draws of the training set $\mathcal{T}$, where $I_2(X:Q)$ denotes the 2-Rényi mutual information between the quantum state space $Q$ and the classical feature space $X$ under state …

Figures (2)

  • Figure 1: Quantum classification in $(a)$ a non-adversarial setting, in which the quantum measurement $\Pi$ acts on the unperturbed quantum embedding $\rho(x)$; $(b)$ an adversarial setting, in which the state $\rho(x)$ is perturbed by a quantum adversary to yield a state $\lambda$.
  • Figure 2: True generalization errors for non-adversarial ($\mathcal{G}(\Pi,\mathcal{T})$) and adversarial ($\mathcal{G}_{1,\epsilon}(\Pi,\mathcal{T})$) settings, compared with numerically evaluated uniform deviation bounds \ref{eq:nonadversarial_udb} and \ref{eq:adversarial_udb} and derived bounds as a function of the training set size $T$: (left) $\epsilon=0.08 \leq 2 \Delta=0.1$, and (right) $\epsilon=0.12 >2\Delta$.

Theorems & Definitions (8)

  • Theorem 1: Banchi et al. [banchi2021generalization]
  • Lemma 1: Shalev-Shwartz and Ben-David [shalev2014understanding]
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Definition 1
  • Lemma 2
  • Theorem 5