Generating Universal Adversarial Perturbations for Quantum Classifiers

Gautham Anil; Vishnu Vinod; Apurva Narayan

TL;DR

This work conceptualizes additive UAPs for PQC-based classifiers and theoretically demonstrates their existence. It also formulates a new method for generating unitary UAPs (QuGAP-U) using quantum generative models and a novel loss function based on fidelity constraints.

Abstract

Quantum Machine Learning (QML) has emerged as a promising field of research, aiming to leverage the capabilities of quantum computing to enhance existing machine learning methodologies. Recent studies have revealed that, like their classical counterparts, QML models based on Parametrized Quantum Circuits (PQCs) are also vulnerable to adversarial attacks. Moreover, the existence of Universal Adversarial Perturbations (UAPs) in the quantum domain has been demonstrated theoretically in the context of quantum classifiers. In this work, we introduce QuGAP: a novel framework for generating UAPs for quantum classifiers. We conceptualize the notion of additive UAPs for PQC-based classifiers and theoretically demonstrate their existence. We then utilize generative models (QuGAP-A) to craft additive UAPs and experimentally show that quantum classifiers are susceptible to such attacks. Moreover, we formulate a new method for generating unitary UAPs (QuGAP-U) using quantum generative models and a novel loss function based on fidelity constraints. We evaluate the performance of the proposed framework and show that our method achieves state-of-the-art misclassification rates, while maintaining high fidelity between legitimate and adversarial samples.

Paper Structure (60 sections, 6 theorems, 127 equations, 13 figures, 1 table, 1 algorithm)

Introduction
Related Work
Adversarial Attack on Neural Networks
Adversarial Attack on PQC-Based Classifiers
Background
Quantum Classifiers
Classical UAPs
Quantum UAPs
Encoding Schemes
Adversarial Loss
Additive UAPs
Motivation
Existence of Additive UAPs
Generative Framework
Experimental Results
...and 45 more sections

Key Result

Lemma 1

The probability of an input $x$ being classified as class $c$ is given by $P(\hat{c}_x = c) = x^\dagger M^c x$, where $x^\dagger$ denotes the conjugate transpose of $x$ and $M^c\in\mathbb{C}^{d\times d}$ is a positive semi-definite matrix given by: $M^{c}_{ij} = \sum_{t=0}^{d-1} \;U_{k't+c,k'i}^* \; \dots$
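The quadratic form in Lemma 1 can be checked numerically. The sketch below is illustrative only and rests on several assumptions that are not taken from the paper: a random unitary `U` stands in for the trained classifier, the class is read out as the basis-state index modulo the number of classes, and no ancilla qubits are used (so the index layout is simpler than the one in the lemma). The helper names `random_unitary` and `class_matrix` are hypothetical.

```python
import numpy as np

def random_unitary(dim, rng):
    """Draw a random unitary from the QR decomposition of a complex Gaussian matrix."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(a)
    # Rescale each column by a unit-modulus phase; the result is still unitary.
    phases = np.diag(r) / np.abs(np.diag(r))
    return q * phases

def class_matrix(U, c, num_classes):
    """Build M^c = U^dagger Pi_c U.

    Assumption: Pi_c projects onto basis states whose index satisfies
    index % num_classes == c (a simplified readout, not the paper's layout).
    """
    dim = U.shape[0]
    mask = (np.arange(dim) % num_classes == c).astype(float)
    return U.conj().T @ (mask[:, None] * U)

rng = np.random.default_rng(0)
d, k = 8, 2                      # state dimension and number of classes (assumed)
U = random_unitary(d, rng)

# A normalized input state x.
x = rng.normal(size=d) + 1j * rng.normal(size=d)
x /= np.linalg.norm(x)

# Class probabilities from the quadratic form x^dagger M^c x.
M = [class_matrix(U, c, k) for c in range(k)]
probs = np.array([np.real(x.conj() @ Mc @ x) for Mc in M])

# The same probabilities from direct simulation: apply U, then sum |amplitude|^2 per class.
out = np.abs(U @ x) ** 2
direct = np.array([out[np.arange(d) % k == c].sum() for c in range(k)])

# Smallest eigenvalue of each M^c; non-negative (up to round-off) if M^c is PSD.
min_eigs = [np.linalg.eigvalsh(Mc).min() for Mc in M]
print(probs, direct, min_eigs)
```

Under these assumptions, `probs` and `direct` should agree, the probabilities should sum to one, and each $M^c$ should have no negative eigenvalues beyond floating-point error, which is what positive semi-definiteness requires.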

Figures (13)

Figure 1: QuGAP-A: A framework for generating additive UAPs for quantum classifiers. A random vector $z$ sampled from $\mathbb{R}^m$ is passed through a classical generative network. The generated perturbation $z'$ is then scaled to impose the norm constraint and then added to an input sample $x$. The perturbed input sample $x'$ is then amplitude-encoded and passed through the trained quantum classifier $\mathcal{Q}$. Output predictions from $\mathcal{Q}$ are then used to compute the fooling loss $\mathcal{L}_{fool}$. Gradients computed are backpropagated to update the generator parameters. The process is repeated for all input samples over multiple epochs.
Figure 2: The misclassification rates for $16\times16$ MNIST and FMNIST using additive untargeted UAPs. We report results for binary classification between classes $0$ and $1$ and 4-class classification between classes $0,1,2$ and $3$.
Figure 3: QuGAP-U: A framework for generating unitary UAPs for quantum classifiers. The quantum generator $\mathcal{G_Q}$ takes in an input state $\ket{\psi_{i}}$ and transforms it into a perturbed state $\ket{\phi_{i}} = U_\mathcal{G}\ket{\psi_{i}}$. The fidelity between $\ket{\psi_{i}}$ and $\ket{\phi_{i}}$ is computed and used to calculate $\mathcal{L}_{fid}$. $\ket{\phi_{i}}$ is also passed through a trained quantum classifier $\mathcal{Q}$ to compute $\mathcal{L}_{fool}$. Gradients are computed using the total loss $\mathcal{L}_{fool} + \alpha \mathcal{L}_{fid}$ and used to update the parameters of $\mathcal{G_Q}$ over all training samples for multiple epochs.
Figure 4: Classical simulation of a framework for generating unitary UAPs. Misclassification rates for TIM binary classification are given in (a), and for $8\times 8$ downsampled MNIST binary classification are given in (b). Achieving competitive misclassification rates for MNIST results in much lower attack fidelities.
Figure 5: Misclassification rate evolution of QuGAP-U on classifiers with depths 10 (Q10) and 20 (Q20). (a) TIM dataset; (b) $8\times 8$ downsampled MNIST dataset.
...and 8 more figures
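The perturbation step of Figure 1 and the loss combination of Figure 3 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the norm constraint is taken to be an L2 budget `eps`, the fooling loss is taken to be the probability assigned to the true class, and the fidelity loss is taken to be $1 - F$. The captions do not specify any of these forms. The function names and the value `p_true=0.7` are hypothetical placeholders.

```python
import numpy as np

def additive_attack(x, z_prime, eps):
    """Scale z_prime to L2 norm eps (assumed constraint), add it to x,
    then amplitude-encode the result by normalizing to unit L2 norm."""
    delta = eps * z_prime / np.linalg.norm(z_prime)
    x_adv = x + delta
    return x_adv / np.linalg.norm(x_adv), delta

def fidelity(psi, phi):
    """Fidelity |<psi|phi>|^2 between two pure states."""
    return np.abs(np.vdot(psi, phi)) ** 2

def total_loss(p_true, fid, alpha):
    """L_fool + alpha * L_fid, with assumed forms L_fool = p_true and L_fid = 1 - fid."""
    return p_true + alpha * (1.0 - fid)

# Additive case (Figure 1): perturb a classical input, then amplitude-encode it.
x = np.array([0.6, 0.8, 0.0, 0.0])
z_prime = np.array([1.0, -1.0, 2.0, 0.5])   # stand-in for a generator output
x_adv, delta = additive_attack(x, z_prime, eps=0.1)

# Unitary case (Figure 3): a single-qubit R_y rotation stands in for U_G.
theta = 0.2
U_G = np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
                [np.sin(theta / 2),  np.cos(theta / 2)]])
psi = np.array([1.0, 0.0])
phi = U_G @ psi
fid = fidelity(psi, phi)

# p_true is a placeholder for the classifier's output on phi.
loss = total_loss(p_true=0.7, fid=fid, alpha=2.0)
print(np.linalg.norm(delta), np.linalg.norm(x_adv), fid, loss)
```

In this toy setting the fidelity equals $\cos^2(\theta/2)$, so a small rotation angle keeps the adversarial state close to the original, while a larger $\alpha$ penalizes departures from it more heavily.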

Theorems & Definitions (6)

Lemma 1
Lemma 2
Lemma 3
Theorem 1
Lemma 4
Theorem 2
