Adversarial Robustness Guarantees for Quantum Classifiers
Neil Dowling, Maxwell T. West, Angus Southwell, Azar C. Nakhl, Martin Sevior, Muhammad Usman, Kavan Modi
TL;DR
This work provides provable robustness guarantees for quantum classifiers against adversarial tampering by connecting quantum dynamics to adversarial robustness. It analyzes three attack regimes (weak targeted, strong local, and universal) and shows how data encoding and dynamical complexity, namely scrambling as measured by out-of-time-order correlators (OTOCs) and chaos as measured by local-operator entanglement (LOE), govern robustness. The authors derive theorems linking state distances and output changes to encoding type and circuit properties, and they support these with numerical simulations using matrix product state methods. The findings suggest a concrete pathway to leverage quantum dynamics for adversarial robustness in QML, complementary to speed or accuracy improvements, and highlight how encoding choices interplay with circuit chaos to bolster security. In noisy settings, the results extend to completely positive trace-preserving (CPTP) maps, indicating resilience to practical imperfections. The work also outlines future directions for trainability and active defense strategies.
Abstract
Despite their ever more widespread deployment throughout society, machine learning algorithms remain critically vulnerable to being spoofed by subtle adversarial tampering with their input data. The prospect of near-term quantum computers being capable of running quantum machine learning (QML) algorithms has therefore generated intense interest in their adversarial vulnerability. Here we show that quantum properties of QML algorithms can confer fundamental protections against such attacks, in certain scenarios guaranteeing robustness against classically armed adversaries. We leverage tools from many-body physics to identify the quantum sources of this protection. Our results offer a theoretical underpinning of recent evidence which suggests quantum advantages in the search for adversarial robustness. In particular, we prove that quantum classifiers are (i) protected against weak perturbations of data drawn from the trained distribution and (ii) protected against local attacks if they are insufficiently scrambling, and (iii) we show evidence that they are protected against universal adversarial attacks if they are sufficiently chaotic. Our analytic results are supported by numerical evidence demonstrating the applicability of our theorems and the resulting robustness of a quantum classifier in practice. This line of inquiry constitutes a concrete pathway to advantage in QML, orthogonal to the usually sought improvements in model speed or accuracy.
