Prospects of Privacy Advantage in Quantum Machine Learning

Jamie Heredge; Niraj Kumar; Dylan Herman; Shouvanik Chakrabarti; Romina Yalovetzky; Shree Hari Sureshbabu; Changhao Li; Marco Pistoia

TL;DR

This work investigates the privacy of input data in variational quantum circuits (VQCs) by analyzing how much the gradients reveal about the input. Using a Lie-theoretic framework centered on the dynamical Lie algebra (DLA) and the Lie-algebraic simulation ($\mathfrak{g}$-sim) method, it distinguishes a weak privacy breach (snapshot recovery) from a strong privacy breach (recovery of the original input by inverting the snapshot), and shows that a polynomial-sized DLA enables efficient snapshot recovery. It further analyzes local Pauli and general Pauli encodings, showing that snapshot inversion can be efficient for several local encodings but becomes intractable for black-box generic encodings unless additional structure is exploited. The results reveal a trade-off: the polynomial-sized DLA that makes Lie algebra supported ansatz (LASA) circuits trainable also makes them prone to snapshot leakage, so robust privacy must rely on encoding architectures that resist snapshot inversion. The paper thus provides a framework for assessing input privacy in VQCs and highlights design directions for balancing privacy with trainability in quantum federated learning contexts.

Abstract

Ensuring data privacy in machine learning models is critical, particularly in distributed settings where model gradients are typically shared among multiple parties to allow collaborative learning. Motivated by the increasing success of recovering input data from the gradients of classical models, this study addresses a central question: How hard is it to recover the input data from the gradients of quantum machine learning models? Focusing on variational quantum circuits (VQCs) as learning models, we uncover the crucial role played by the dynamical Lie algebra (DLA) of the VQC ansatz in determining privacy vulnerabilities. While the DLA has previously been linked to the classical simulatability and trainability of VQC models, this work, for the first time, establishes its connection to the privacy of VQC models. In particular, we show that properties conducive to the trainability of VQCs, such as a polynomial-sized DLA, also facilitate the extraction of detailed snapshots of the input. We term this a weak privacy breach, as the snapshots enable training VQC models for distinct learning tasks without direct access to the original input. Further, we investigate the conditions for a strong privacy breach where the original input data can be recovered from these snapshots by classical or quantum-assisted polynomial time methods. We establish conditions on the encoding map such as classical simulatability, overlap with DLA basis, and its Fourier frequency characteristics that enable such a privacy breach of VQC models. Our findings thus play a crucial role in detailing the prospects of quantum privacy advantage by guiding the requirements for designing quantum machine learning models that balance trainability with robust privacy protection.
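To make the idea of recovering a snapshot from gradients concrete, the following is a minimal single-qubit sketch and not the paper's algorithm. It assumes a hypothetical encoding `encode` (an $R_Y$ followed by an $R_X$ rotation), an ansatz $U(\theta) = e^{-i\theta X/2}$, and the observable $Z$. Under these assumptions the Heisenberg-evolved observable stays in the span of $Y$ and $Z$, so the cost is $C(\theta) = \cos\theta\, e_Z + \sin\theta\, e_Y$ and each gradient is linear in the two snapshot entries $(e_Y, e_Z)$. This two-dimensional subspace is only a stand-in for the DLA basis used in the paper. All function names are illustrative.

```python
import math

def apply_ry(angle, state):
    """Apply R_Y(angle) = exp(-i * angle * Y / 2) to a single-qubit state."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    a0, a1 = state
    return (c * a0 - s * a1, s * a0 + c * a1)

def apply_rx(angle, state):
    """Apply R_X(angle) = exp(-i * angle * X / 2) to a single-qubit state."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    a0, a1 = state
    return (c * a0 - 1j * s * a1, -1j * s * a0 + c * a1)

def expect_y(state):
    """Expectation value of Pauli Y."""
    a0, a1 = state
    return 2.0 * (a0.conjugate() * a1).imag

def expect_z(state):
    """Expectation value of Pauli Z."""
    a0, a1 = state
    return abs(a0) ** 2 - abs(a1) ** 2

def encode(x):
    """Hypothetical encoding V(x)|0> (an assumption, not taken from the paper)."""
    return apply_rx(x[1], apply_ry(x[0], (1 + 0j, 0j)))

def cost(theta, state):
    """C(theta) = <Z> after applying the ansatz R_X(theta) to the encoded state."""
    return expect_z(apply_rx(theta, state))

def gradient(theta, state):
    """dC/dtheta via the parameter-shift rule for a Pauli-generated rotation."""
    shift = math.pi / 2
    return 0.5 * (cost(theta + shift, state) - cost(theta - shift, state))

def recover_snapshot(thetas, grads):
    """Solve g_k = cos(theta_k) * e_Y - sin(theta_k) * e_Z for (e_Y, e_Z)."""
    (t1, t2), (g1, g2) = thetas, grads
    det = math.sin(t1 - t2)  # nonzero unless t1 - t2 is a multiple of pi
    e_y = (-g1 * math.sin(t2) + g2 * math.sin(t1)) / det
    e_z = (g2 * math.cos(t1) - g1 * math.cos(t2)) / det
    return e_y, e_z

# An attacker who sees gradients at two known parameter settings:
state = encode([0.8, 0.5])
thetas = (0.4, 1.3)
grads = tuple(gradient(t, state) for t in thetas)
e_y_rec, e_z_rec = recover_snapshot(thetas, grads)
```

In this toy setting two gradient evaluations at distinct parameter values suffice because there are only two unknowns. The recovered pair `(e_y_rec, e_z_rec)` matches the expectation values measured directly on the encoded state, which is the sense in which a snapshot leaks without the input `x` itself being revealed.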

Paper Structure (26 sections, 6 theorems, 70 equations, 8 figures, 2 tables, 4 algorithms)

Introduction
General Framework
Variational Quantum Circuits for Machine Learning
Lie Theoretic Framework
Input Recoverability Definitions
Snapshot Recovery
Review of Lie-Algebraic Simulation Framework
Snapshot Recovery Algorithm
Snapshot Invertibility
Snapshot Inversion for Local Encodings
Pauli Product Encoding
General Pauli Encoding
Snapshot Inversion for Generic Encodings
Direct Input Recovery
Expectation Value Landscape Numerical Results
...and 11 more sections

Key Result

Theorem 1

If ansatz family $\mathbf{U}(\boldsymbol{\theta})$ with an observable $\mathbf{O}$ satisfies both the LASA condition and Slow Pauli Expansion, then the cost function and its gradients can be simulated with complexity $\mathcal{O}(\text{poly}(\dim (\mathfrak{g})))$ using a procedure that at most quer

Figures (8)

Figure 1: Overview of the general framework and definitions. Weak privacy breach corresponds to attacks where snapshots of the data are retrieved. These can be used as inputs to other models, without explicitly needing the exact data, allowing one to potentially learn characteristics of the data. If these snapshots can then be further inverted to retrieve the input data $\mathbf{x}$ explicitly, we say the attack has succeeded in a strong privacy breach.
Figure 2: a) Visualization of the difference between the circuit implementation of a variational quantum model and a Lie algebraic simulation procedure of the same model [goh2023lie]. In the upper circuit, a VQC encodes the input data $\mathbf{x}$ into a quantum state using the encoding step $V(\mathbf{x})$, which is then passed through a variational circuit $U(\theta)$. A measurement of $O$ is then taken on the resulting state to calculate the model output $y_\theta$. In the lower circuit, the Lie Algebraic Simulation framework [goh2023lie] is shown. The input data $\mathbf{x}$ is likewise encoded using $V(\mathbf{x})$, but the measurements are performed directly on this encoded state and used to form a vector of snapshot expectation values. This vector is then passed as input to a classical simulator that uses the adjoint form of $U(\theta)$, which can be evaluated with resources scaling with the dimension of the DLA formed by the generators of $U(\theta)$. b) In this work, we assess the ability to recover an input $\mathbf{x}$ from gradients $C_j$. This can be broken into two parts. First, the snapshot $\mathbf{e}_{\text{snap}}$ must be recovered from the gradients $C_j$, which corresponds to reversing the Lie Algebraic Simulation step. Second, the recovered snapshot $\mathbf{e}_{\text{snap}}$ must be inverted to find the original data $\mathbf{x}$, which requires finding the values of $\mathbf{x}$ that, when input into $V(\mathbf{x})$, give the same snapshot values $\mathbf{e}_{\text{snap}}$. If both snapshot recovery and snapshot inversion can be performed efficiently, then the model admits efficient input recovery.
Figure 3: A product map encoding, whereby each input variable $x_j$ is encoded into an individual qubit, and the snapshot used by the model corresponds to single-qubit measurements of the DLA basis elements. In this setting, the snapshot is trivial to invert to find the original data, using the relation $x_j = \cos^{-1}\left(2\boldsymbol{\gamma}^{(j)}\cdot \mathbf{e}_{\text{snap}}\right)$.
Figure 4: Encoding circuit diagram showing a single-qubit $R_{\mathsf{X}}$ rotation gate parameterized by the univariate parameter $x$, with arbitrary $2^n$-dimensional unitaries applied before and after the $x$-parameterized gate. Despite being hard to simulate analytically, the expectation value $e_{\text{in}}$ varies as a simple sinusoidal function of $x$, regardless of the total number of qubits $n$.
Figure 5: Encoding circuit diagram showing an $\text{SU}(2^n)$ gate parameterized by a univariate parameter $x$.
...and 3 more figures
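The product-encoding inversion described in the Figure 3 caption can be illustrated with a minimal sketch. It assumes, hypothetically, that each feature $x_j \in [0, \pi]$ is encoded as $R_X(x_j)|0\rangle$ on its own qubit and that the snapshot contains the single-qubit Pauli-$Z$ expectation values, so that $\langle Z_j \rangle = \cos x_j$. The factor of 2 and the coefficient vector $\boldsymbol{\gamma}^{(j)}$ in the caption depend on the paper's normalization of the DLA basis, which is not reproduced here. Function names are illustrative.

```python
import math

def encode_qubit(x_j):
    """Amplitudes of R_X(x_j)|0> = cos(x_j/2)|0> - i sin(x_j/2)|1> (assumed encoding)."""
    return (complex(math.cos(x_j / 2), 0.0), complex(0.0, -math.sin(x_j / 2)))

def expect_z(amplitudes):
    """Single-qubit Pauli-Z expectation value."""
    a0, a1 = amplitudes
    return abs(a0) ** 2 - abs(a1) ** 2

def snapshot(x):
    """Snapshot of a product state: one <Z_j> per qubit, equal to cos(x_j)."""
    return [expect_z(encode_qubit(x_j)) for x_j in x]

def invert_snapshot(e_snap):
    """Invert each entry independently; clamp guards against rounding outside [-1, 1]."""
    return [math.acos(max(-1.0, min(1.0, e))) for e in e_snap]

x_true = [0.3, 1.2, 2.5]
x_rec = invert_snapshot(snapshot(x_true))
```

Because the encoding is a product map, each snapshot entry depends on a single feature, so the inversion cost grows linearly with the number of features. The restriction to $[0, \pi]$ is needed for `acos` to return a unique value.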

Theorems & Definitions (21)

Definition 1: Dynamical Lie Algebra
Definition 2: Dynamical Lie Group
Definition 3: Adjoint representation
Definition 4: DLA basis
Definition 5: Lie Algebra Supported Ansatz (LASA) [fontana2023adjoint]
Definition 6: Snapshot Recovery
Definition 7: Classically Snapshot Invertible Model
Definition 8: Quantum Assisted Snapshot inversion
Definition 9: Slow Pauli Expansion
Theorem 1: Complexity of $\mathfrak{g}$-sim
...and 11 more
