Exploiting the equivalence between quantum neural networks and perceptrons

Chris Mingard, Jessica Pointing, Charles London, Yoonsoo Nam, Ard A. Louis

TL;DR

The paper analyzes the expressivity of quantum neural networks (QNNs) through an exact mapping to a tensor-product perceptron (TPP) that acts on $h = x\otimes x$ (or $x\circ x$ for complex inputs). The mapping exposes strong encoding-dependent inductive biases and expressivity limits, such as the inability of an amplitude-encoded QNN to express the Boolean parity function for $n\geq 3$, that hinder QNNs as general-purpose learners on classical data. Because the TPP can be trained in place of the QNN, the authors can systematically study inductive biases on Boolean data: amplitude encoding imposes severe expressivity constraints, while several popular, more expressive embeddings are biased mainly towards functions with low class balance. The authors propose two pathways beyond standard QNNs: using a QNN to help generate a classical DNN-inspired kernel, and constructing layered non-linear quantum neural networks (DQNNs) that are provably fully expressive on Boolean data and exhibit a richer inductive bias, albeit with computational costs. Overall, the work cautions against claims of quantum advantage on classical data, highlights the potential of quantum approaches for quantum data or hybrid schemes, and underscores the need for careful, transparent benchmarking and scalable architectures.

Abstract

Quantum machine learning models based on parametrized quantum circuits, also called quantum neural networks (QNNs), are considered to be among the most promising candidates for applications on near-term quantum devices. Here we explore the expressivity and inductive bias of QNNs by exploiting an exact mapping from QNNs with inputs $x$ to classical perceptrons acting on $x \otimes x$ (generalised to complex inputs). The simplicity of the perceptron architecture allows us to provide clear examples of the shortcomings of current QNN models, and the many barriers they face to becoming useful general-purpose learning algorithms. For example, a QNN with amplitude encoding cannot express the Boolean parity function for $n\geq 3$, which is but one of an exponential number of data structures that such a QNN is unable to express. Mapping a QNN to a classical perceptron simplifies training, allowing us to systematically study the inductive biases of other, more expressive embeddings on Boolean data. Several popular embeddings primarily produce an inductive bias towards functions with low class balance, reducing their generalisation performance compared to deep neural network architectures which exhibit much richer inductive biases. We explore two alternate strategies that move beyond standard QNNs. In the first, we use a QNN to help generate a classical DNN-inspired kernel. In the second we draw an analogy to the hierarchical structure of deep neural networks and construct a layered non-linear QNN that is provably fully expressive on Boolean data, while also exhibiting a richer inductive bias than simple QNNs. Finally, we discuss characteristics of the QNN literature that may obscure how hard it is to achieve quantum advantage over deep learning algorithms on classical data.
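
The mapping described in the abstract can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the QNN output is the expectation value $\langle\psi|U^\dagger M U|\psi\rangle$ of a fixed observable $M$ (here Pauli-Z on the last qubit, standing in for the readout qubit), and that the input is amplitude encoded by normalising it. The function names `random_unitary`, `amplitude_encode`, `qnn_output` and `tpp_output` are hypothetical. Under these assumptions the QNN output equals a linear function, with weights $W = U^\dagger M U$, of the tensor product $\bar{\psi}\otimes\psi$, which reduces to $x\otimes x$ for real inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(dim, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    # Rescale each column by a unit phase; q stays unitary.
    return q * (d / np.abs(d))

def amplitude_encode(x):
    """Assumed amplitude encoding: normalise x to a unit vector."""
    x = np.asarray(x, dtype=complex)
    return x / np.linalg.norm(x)

def qnn_output(U, M, psi):
    """Expectation value <psi| U^dagger M U |psi> (assumed QNN output)."""
    phi = U @ psi
    return np.real(np.vdot(phi, M @ phi))

def tpp_output(W, psi):
    """Perceptron-style output: flattened weights dotted with conj(psi) (x) psi."""
    h = np.kron(np.conj(psi), psi)
    return np.real(W.reshape(-1) @ h)

n_qubits = 3
dim = 2 ** n_qubits
U = random_unitary(dim, rng)
# Pauli-Z on the last qubit, identity elsewhere (illustrative observable).
M = np.kron(np.eye(dim // 2), np.diag([1.0, -1.0]))
# Weight matrix of the equivalent perceptron.
W = U.conj().T @ M @ U

x = rng.normal(size=dim)  # real-valued classical input
psi = amplitude_encode(x)

qnn_val = qnn_output(U, M, psi)
tpp_val = tpp_output(W, psi)
print(qnn_val, tpp_val)
```

The two printed values should agree to numerical precision. The weight matrix `W` is Hermitian, so the perceptron side has no more freedom than the circuit side; only the input is lifted to the tensor-product feature space.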

Paper Structure (31 sections, 15 theorems, 46 equations, 18 figures)

This paper contains 31 sections, 15 theorems, 46 equations, 18 figures.

Key Result

Lemma E.1

A QNN acting on an input Hilbert space $\mathbb{C}^{2^n}$ has $(2^n)^2$ independent parameters.
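
One standard way to see where this count comes from (a sketch, not necessarily the argument used in the paper's proof): write $N = 2^n$ and parametrise the unitary as $U = e^{iH}$ with $H$ an $N \times N$ Hermitian matrix. $H$ has $N$ real diagonal entries and $N(N-1)/2$ complex entries above the diagonal, each contributing two real numbers, so

$$
\underbrace{N}_{\text{real diagonal}} \;+\; \underbrace{2\cdot\frac{N(N-1)}{2}}_{\text{complex off-diagonal}} \;=\; N^2 \;=\; (2^n)^2 .
$$

For example, $n = 3$ qubits gives $N = 8$ and $N^2 = 64$ real parameters.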

Figures (18)

  • Figure 1: Schematic of a QNN used for binary classification. There are $n$ data input qubits encoded in state $|x\rangle$, one $|0\rangle$ initialised readout qubit, and an arbitrary classically parameterised unitary $U(\theta)$. For non-binary data, more readout qubits are required.
  • Figure 2: Test error vs. LZ complexity and class balance for different encoding methods and algorithms for supervised learning on the $n=7$ Boolean dataset. The first five columns show QNNs with different encoding methods and the two final columns show two classical neural networks -- the perceptron and a 1-hidden layer FCN. The bracketed values after each encoding type give the dimension of the classical (quantum) input (Hilbert) space. The top (bottom) row shows generalisation error vs. LZ complexity (class balance). Class balance is the minimum proportion of 0s or 1s in the function. Each datapoint is for one of 100 target functions chosen to have a wide range of entropies and LZ complexities. The perceptron can fit 7 functions and the QNN with amplitude encoding can fit 41 (neither can fit the parity function), while the QNN with RT(n) encoding can only fit the trivial function. All other learning algorithms can express all 100 target functions. The QNN with basis encoding has no inductive bias, and the QNN with either RT encoding has a simple bias towards low class balance functions, performing poorly on functions with high class balance and low complexity. ZZ encoding has a similar inductive bias, except it achieves a perfect 0% test error on the parity function, for which it has been engineered. The QNN with amplitude encoding and the perceptron are very similar in their inductive biases towards low class balance and towards low LZ complexity. The 1-hidden layer FCN is even more biased towards simple (low LZ complexity) functions, and is fully expressive -- a clear upgrade on both fronts over the perceptron and the QNNs.
  • Figure 3: A DQNN used for binary classification. There are $n$ data input qubits and $p$ $\ket{0}$-initialised intermediate qubits in the first layer. The measured output of the first layer $\langle r \rangle$ is then re-encoded into a quantum state $\ket{r}$ (using an encoding method $\phi$), and passed through a final layer with one $\ket{0}$-initialised output qubit.
  • Figure 4: Test error vs. complexity and class balance for DQNNs and $K_Q^1$ for the $n=7$ Boolean dataset. The top (bottom) row shows generalisation error vs. LZ complexity (class balance). Each datapoint is for one of the 100 target functions used in Figure 2. All three algorithms receive data with amplitude encoding and are fully expressive. The DQNN-$\alpha$ (with $q=6$ intermediate qubits) has the most similar inductive bias to the kernel $K_Q^1$ and the finite-width FCN (see Figure 2). The inductive bias of DQNN-$\beta$ ($q=7$) is weaker than DQNN-$\alpha$ due to the basis encoding used after the first layer.
  • Figure 5: Generalisation error for different architectures on the simplified $8$-component Q-FashionMNIST dataset, amplitude encoded on 3 qubits. The QNN (sim) and QNN (TPP) are amplitude-encoded QNNs simulated with PyTorch and the TPP classical mapping respectively until their train accuracy no longer improves. The QNN (sim) does not fully converge due to barren plateaus and so has a higher training error. As in Figure 2, the DQNN-$\alpha$ outperformed the DQNN-$\beta$. $K_Q^1$ performs better than the FCN. The simple perceptron converges to a higher training error than the TPP, but a lower test error because it has a better inductive bias for this problem.
  • ...and 13 more figures
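
The two axes used in Figures 2 and 4 can be illustrated on a Boolean target function written as a truth-table string. The sketch below is an illustration under stated assumptions, not the authors' code. Class balance follows the definition given in the Figure 2 caption. For LZ complexity, it assumes the term refers to a Lempel-Ziv measure and uses a raw 1976-style phrase count; the measure used in the paper may differ, for instance by a normalisation. The helper names `truth_table`, `parity`, `class_balance` and `lz76_phrase_count` are hypothetical.

```python
import itertools

def truth_table(f, n):
    """Evaluate f on all 2**n inputs in lexicographic order; return a 0/1 string."""
    return "".join(str(f(bits)) for bits in itertools.product((0, 1), repeat=n))

def parity(bits):
    """Boolean parity: 1 if the number of ones is odd, else 0."""
    return sum(bits) % 2

def class_balance(table):
    """Minimum proportion of 0s or 1s in the truth table."""
    ones = table.count("1")
    return min(ones, len(table) - ones) / len(table)

def lz76_phrase_count(s):
    """Number of phrases in a Lempel-Ziv (1976-style) parsing of s.

    Each phrase is the shortest prefix of the remaining string that has
    not occurred earlier (overlap with the current phrase is allowed).
    This raw count is an assumed stand-in for the paper's LZ measure.
    """
    i, count, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend the phrase while it can still be copied from earlier text.
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count

n = 7
parity_table = truth_table(parity, n)
constant_table = "0" * (2 ** n)

print(class_balance(parity_table), lz76_phrase_count(parity_table))
print(class_balance(constant_table), lz76_phrase_count(constant_table))
```

Parity is a perfectly balanced function (half of the $2^n$ inputs map to 1), whereas a constant function has class balance 0 and the smallest possible phrase count for its length.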

Theorems & Definitions (38)

  • Definition C.1
  • Definition C.2: The standard representation of $\{0,1\}^n$
  • Definition C.3: The string representation of $\{0,1\}^n$
  • Definition C.4: Basis encoding
  • Definition C.5: Amplitude encoding
  • Definition C.6: ZZ encoding
  • Definition C.7: Random Transform (RT) encoding
  • Lemma E.1: Parameter counting in QNNs
  • proof
  • Lemma E.2
  • ...and 28 more