Do Quantum Neural Networks have Simplicity Bias?

Jessica Pointing

TL;DR

This paper investigates whether quantum neural networks (QNNs) have a simplicity bias akin to that of deep neural networks (DNNs) by analysing inductive bias and expressivity across encoding methods on Boolean data. It demonstrates a bias–expressivity tradeoff: amplitude encoding can yield a simplicity bias but restricts expressivity (parity may become inexpressible for small $n$), while the ZZ feature map and the random ReLU transform provide high expressivity with a weak, often low-entropy, inductive bias. Basis encoding can be fully expressive with no intrinsic inductive bias, yet an artificial bias emerges when expressivity is constrained. Overall, the QNN architectures studied may not outperform DNNs in generalisation on structured real-world data, highlighting the need to balance encoding-induced bias and expressivity carefully in QNN design.

Abstract

One hypothesis for the success of deep neural networks (DNNs) is that they are highly expressive, which enables them to be applied to many problems, and that they have a strong inductive bias towards simple solutions, known as simplicity bias, which allows them to generalise well on unseen data because most real-world data is structured (i.e. simple). In this work, we explore the inductive bias and expressivity of quantum neural networks (QNNs), which gives us a way to compare their performance to that of DNNs. Our results show that it is possible to have simplicity bias with certain QNNs, but we prove that this type of QNN has limited expressivity. We also show that it is possible to have QNNs with high expressivity, but they have either no inductive bias or a poor inductive bias, and they generalise worse than DNNs. We demonstrate that an artificial (restricted) inductive bias can be produced by intentionally restricting the expressivity of a QNN. Our results suggest a bias-expressivity tradeoff. Our conclusion is that the QNNs we studied cannot generally offer an advantage over DNNs, because these QNNs have either a poor inductive bias or poor expressivity compared to DNNs.

Paper Structure (67 sections, 23 equations, 37 figures, 10 tables)

Figures (37)

  • Figure 1: Quantum neural network with three distinct parts: (1) the encoder circuit, which encodes the data into the QNN; (2) the variational circuit, which is parametrised and has its parameters optimised; and (3) the measurement operators, which retrieve classical information from the QNN.
  • Figure 2: Learning process for a quantum neural network: the classical information retrieved from the measurement operators is fed into a classical optimiser, which computes the loss and gradients and updates the parameters. The new parameters are set in the parametrised quantum gates of the variational circuit, and the process is repeated until the output of the QNN converges or a stopping condition is satisfied.
  • Figure 3: ZZ feature map for two qubits: this feature map is used for encoding data into a QNN. The feature map for two qubits consists of Hadamard gates, followed by $U_1$ gates where the first datapoint $x_0$ is encoded in the parameter for the $U_1$ gate. A CNOT gate follows, then another $U_1$ gate on the second qubit, which encodes both datapoints $x_0$ and $x_1$ in the parameter for the $U_1$ gate. Another CNOT gate follows.
  • Figure 4: Probability of a Boolean function $f$ versus its complexity for five data qubits and $10^5$ samples. $P(f)$ is plotted against Lempel-Ziv complexity $K$ for a fully expressive QNN with basis encoding (red), the ZZ feature map (blue), the random ReLU transform (pink), amplitude encoding (orange), a DNN (green), and a non-fully expressive QNN with basis encoding (grey). $P(f)$ is estimated by generating $10^5$ sample functions from the QNN, with the parameters $\Theta$ drawn at random from a uniform distribution.
  • Figure 5: Generalisation error versus Lempel-Ziv (LZ) complexity (top figures) and entropy (bottom figures) for five data qubits for a fully expressive QNN with different encoding methods. The QNN is trained to zero training error on a training set $S$ of size $m = 16$, and the generalisation error is calculated on the remaining $16$ inputs. The QNNs are trained by randomly sampling parameters $10^5$ times and computing the generalisation error for the parameter sets that obtain zero training error. Error bars are one standard deviation. The data points show the error on the test dataset for target functions with different LZ complexities and entropies. The maximum entropy is 16, as the Boolean functions have length $2^5 = 32$ and entropy is defined here as the smaller of the number of 0s and the number of 1s in the function. The green data point is the parity function.
  • ...and 32 more figures
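
The entropy measure described in the Figure 5 caption can be illustrated with a minimal sketch. It assumes a Boolean function on $n$ bits is written as a string of $2^n$ output bits, one per input, with inputs enumerated in lexicographic order. The helper names (`truth_table`, `entropy`, `parity`) and the input ordering are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def truth_table(f, n):
    """Write the Boolean function f on n bits as a string of 2**n output bits.

    Inputs are enumerated in lexicographic order (an assumption of this sketch).
    """
    return "".join(str(f(bits)) for bits in product((0, 1), repeat=n))


def entropy(table):
    """Entropy as used in the Figure 5 caption: the smaller of the 0 and 1 counts."""
    return min(table.count("0"), table.count("1"))


def parity(bits):
    """Return 1 if an odd number of input bits are set, else 0."""
    return sum(bits) % 2


n = 5
parity_table = truth_table(parity, n)
constant_table = truth_table(lambda bits: 0, n)

print(len(parity_table))        # length of a function on five bits
print(entropy(parity_table))    # parity is balanced, so it attains the maximum
print(entropy(constant_table))  # a constant function has the minimum
```

Under this definition, the parity function on five bits has as many 1s as 0s and therefore sits at the maximum entropy of 16 quoted in the caption, while a constant function has entropy 0.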