Training quantum machine learning models on cloud without uploading the data

Guang Ping He

TL;DR

This work tackles privacy concerns in training quantum ML models on cloud platforms by introducing a run-before-encoding scheme that exploits the linearity of quantum unitaries. The method precomputes a matrix $B$ on the cloud using basis inputs, allowing the data owner to compute $p_m(x)$ and the final cost entirely on a classical device after encoding decisions are made, thereby keeping the dataset secret. A central insight is $\boldsymbol{a}(x)=B\boldsymbol{f}(x)$, with $p_m(x)=\vert a_m(x)\vert^2$ and a sign-deduction procedure enabling reconstruction of amplitudes from cloud outputs. Experimentally, the approach reduces quantum-device time and accelerates training for large datasets, while also mitigating the encoding bottleneck (scaling closer to $O(n)$ gates on average) and enabling a classical shadow model via the fixed matrix $B$. Collectively, the work demonstrates a practical, privacy-preserving pathway for quantum-cloud ML and points toward quantum-inspired classical equivalents that preserve the nonlinear benefits of measurement at the end.
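The linearity argument above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the circuit is replaced by a random real orthogonal matrix, the encoding $\boldsymbol{f}(x)$ is assumed to be plain amplitude encoding (normalizing the feature vector), and the amplitudes are read directly from a simulated unitary, so the sign-deduction step needed on real hardware is skipped. All function names are hypothetical.

```python
import numpy as np


def random_real_unitary(dim, seed=0):
    """Stand-in for the parameterized circuit (assumption): a random
    real orthogonal matrix obtained from a QR decomposition."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    return q


def precompute_B(unitary):
    """'Cloud' step: apply the circuit to every computational basis state.
    Column i of B holds the output amplitudes for basis input |i>.
    No dataset entry is needed at this stage."""
    dim = unitary.shape[0]
    basis = np.eye(dim)
    return np.column_stack([unitary @ basis[:, i] for i in range(dim)])


def encode(x):
    """Assumed amplitude encoding f(x): the normalized feature vector."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x)


def probabilities(B, x):
    """'Local' step: a(x) = B f(x), then p_m(x) = |a_m(x)|^2,
    computed classically by the data owner."""
    a = B @ encode(x)
    return np.abs(a) ** 2


n_qubits = 3
U = random_real_unitary(2 ** n_qubits)
B = precompute_B(U)                      # computed once, without any data

x = np.arange(1, 2 ** n_qubits + 1)      # a private data point (illustrative)
p_local = probabilities(B, x)            # never leaves the data owner
p_direct = np.abs(U @ encode(x)) ** 2    # what encoding first would give

print(np.allclose(p_local, p_direct))
```

Because $B$ is fixed once the circuit parameters are fixed, the same matrix can be reused for every data point, which is what lets the per-sample work run classically in this sketch.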

Abstract

Based on the linearity of quantum unitary operations, we propose a method that runs the parameterized quantum circuits before encoding the input data. This enables a dataset owner to train machine learning models on quantum cloud computation platforms without the risk of leaking information about the data. The method can also encode a vast amount of data efficiently at a later time using classical computations, thus saving runtime on quantum computation devices. The trained quantum machine learning models can be run completely on classical computers, meaning the dataset owner needs neither quantum hardware nor quantum simulators. Moreover, our method mitigates the encoding bottleneck by reducing the required circuit depth from $O(2^{n})$ to $O(n)$, and relaxes the precision requirements on the quantum gates used for the encoding. These results demonstrate yet another advantage of quantum and quantum-inspired machine learning models over existing classical neural networks, and broaden the approaches to data security.

Paper Structure (7 sections, 27 equations, 3 figures)

Introduction
Typical features of quantum machine learning models
Our method
Advantages
Experimental results
Further improvements for better security
More discussion on the advantage

Figures (3)

Figure 1: A typical variational quantum circuit (VQC) with the RealAmplitudes ansatz, shown with 2 repetitions and full entanglement.
Figure 2: Flow chart of our run-before-encoding method.
Figure 3: Runtime of the VQC as a function of the number of input data points, where the ansatz of the VQC contains (a) 1 repetition and (b) 2 repetitions. The blue and orange lines are the results of our method and the old method, respectively. The solid lines show the total time spent, while the dashed lines show the time spent on quantum devices.
