Enhancing Expressivity of Quantum Neural Networks Based on the SWAP test

Sebastian Nagies, Emiliano Tolotti, Davide Pastorello, Enrico Blanzieri

TL;DR

This work establishes a framework for analyzing and enhancing QNN expressivity through correspondence with classical architectures, and demonstrates that SWAP test-based QNNs possess broad representational capacity relevant to both classical and potentially quantum learning tasks.

Abstract

Quantum neural networks (QNNs) based on parametrized quantum circuits are promising candidates for machine learning applications, yet many architectures lack clear connections to classical models, potentially limiting their ability to leverage established classical neural network techniques. We examine QNNs built from SWAP test circuits and discuss their equivalence to classical two-layer feedforward networks with quadratic activations under amplitude encoding. Evaluation on real-world and synthetic datasets shows that while this architecture learns many practical binary classification tasks, it has fundamental expressivity limitations: polynomial activation functions do not satisfy the universal approximation theorem, and we show analytically that the architecture cannot learn the parity check function beyond two dimensions, regardless of network size. To address this, we introduce generalized SWAP test circuits with multiple Fredkin gates sharing an ancilla, implementing product layers with polynomial activations of arbitrary even degree. This modification enables successful learning of parity check functions in arbitrary dimensions as well as binary n-spiral tasks, and we provide numerical evidence that the expressivity enhancement extends to alternative encoding schemes such as angle (Z) and ZZ feature maps. We validate the practical feasibility of our proposed architecture by implementing a classically pretrained instance on the IBM Torino quantum processor, achieving 84% classification accuracy on the three-dimensional parity check despite hardware noise. Our work establishes a framework for analyzing and enhancing QNN expressivity through correspondence with classical architectures, and demonstrates that SWAP test-based QNNs possess broad representational capacity relevant to both classical and potentially quantum learning tasks.
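The correspondence between a SWAP test and a quadratic activation can be illustrated with a short classical simulation. The sketch below relies on the standard SWAP test identity $P(0) = \tfrac{1}{2} + \tfrac{1}{2}|\langle\psi|\phi\rangle|^2$ and assumes real-valued inputs and weights; the function names `amplitude_encode` and `swap_test_p0` are hypothetical and not taken from the paper.

```python
import math

def amplitude_encode(vec):
    """Zero-pad to the next power of two and L2-normalise (amplitude encoding)."""
    n = len(vec)
    if n == 0:
        raise ValueError("cannot encode an empty vector")
    num_qubits = (n - 1).bit_length()      # equals ceil(log2(n)) for n >= 1
    padded = list(vec) + [0.0] * ((1 << num_qubits) - n)
    norm = math.sqrt(sum(v * v for v in padded))
    if norm == 0.0:
        raise ValueError("cannot amplitude-encode the zero vector")
    return [v / norm for v in padded]

def swap_test_p0(x, w):
    """Ideal (noise-free) probability of measuring the ancilla in |0>.

    Uses P(0) = 1/2 + 1/2 * <psi|phi>^2 for real amplitudes, i.e. a
    quadratic activation applied to the normalised inner product.
    """
    if len(x) != len(w):
        raise ValueError("input and weight vectors must have equal length")
    psi = amplitude_encode(x)
    phi = amplitude_encode(w)
    overlap = sum(a * b for a, b in zip(psi, phi))
    return 0.5 + 0.5 * overlap ** 2
```

Because the overlap enters squared, the output is always in $[0.5, 1]$ and is unchanged when the input is replaced by its negative, which gives some intuition for why a purely quadratic activation restricts what the network can represent.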

Paper Structure

This paper contains 22 sections, 29 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: Quantum circuit implementing the SWAP test, which estimates the overlap between two quantum states $\ket{\psi}$ and $\ket{\phi}$ by measuring an ancilla qubit initialized to state $\ket{0}$. Both states are represented on quantum registers with $\delta$ qubits. If the two quantum states encode classical inputs and weights via amplitude encoding in a quantum perceptron context, the number of required qubits is $\delta = \lceil\log_2{d}\rceil$, where $d$ is the dimension of the classical input and weight vectors.
  • Figure 2: Generalization of the SWAP test to a product module. $\bm{w}_{ij}$ is the weight vector in product module $i$, with factor index $j$. $\delta = \lceil\log_2{(d+1)}\rceil$ is the number of qubits needed to amplitude encode the input and weight vectors $\bm{x}$ and $\bm{w}_{ij}$ of dimension $d$ and $d+1$ respectively (assuming one of the weights acts as a bias). The overall product module can be seen as $k$ SWAP tests (see Fig. 1) executed using the same ancilla qubit. After measuring this ancilla, the probability $P(0)_i$ corresponds to a polynomial activation function of degree $2k$ (see Eq. \ref{eq:product_layer}).
  • Figure 3: Accuracy and F1 score distribution on the real-world datasets with the PyTorch implementation of the QNN with product layer (see Eq. \ref{eq:product_layer}), for increasing number of product modules $N$ and factor modules $k$. Each boxplot contains 21 points, each one being the mean value across different folds for each dataset (see Sec. \ref{subsec:training}). Horizontal lines represent the median for all datasets and triangle markers indicate mean values.
  • Figure 4: F1 scores for learning the IJCNN1 dataset with our QNN architecture (Eq. \ref{eq:product_layer}). Points represent mean values across different folds, and error bars represent 95% confidence intervals.
  • Figure 5: Accuracy results for the parity check dataset with the PyTorch implementation of the proposed QNN architecture (Eq. \ref{eq:product_layer}). $d$ is the dimension of the parity check dataset, $k$ is the number of factor modules in each of the $N$ modules. We generated 1000 samples in each of the $2^d$ respective decision regions. Each accuracy point is the maximum achieved accuracy on the test set, obtained by training the network for 50000 epochs, for learning rates in $\{0.01,0.1,1,10\}$.
  • ...and 2 more figures
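
The product module described in the caption of Figure 2 can be sketched classically as follows. This is a minimal illustration, not the paper's implementation: it assumes real-valued vectors, an ideal noise-free circuit, and that the bias is handled by appending a constant 1 to the input before encoding. Under those assumptions, $k$ controlled-SWAP gates sharing one ancilla give $P(0)_i = \tfrac{1}{2} + \tfrac{1}{2}\prod_{j=1}^{k}\langle \bm{x}|\bm{w}_{ij}\rangle^2$, a polynomial of degree $2k$ in the overlaps. The paper's own product-layer equation is not reproduced in this section and may use a different normalization; the function names are hypothetical.

```python
import math

def _encode(vec):
    """Zero-pad to a power-of-two length and L2-normalise (amplitude encoding)."""
    size = 1 << (len(vec) - 1).bit_length()
    padded = list(vec) + [0.0] * (size - len(vec))
    norm = math.sqrt(sum(v * v for v in padded))
    if norm == 0.0:
        raise ValueError("cannot amplitude-encode the zero vector")
    return [v / norm for v in padded]

def product_module_p0(x, weights):
    """Ideal P(0) of one product module with k = len(weights) factors.

    x       : input vector of dimension d
    weights : k weight vectors of dimension d + 1 (last entry acts as bias)
    """
    x_ext = list(x) + [1.0]            # assumption: constant feature for the bias
    psi = _encode(x_ext)
    prod = 1.0
    for w in weights:
        if len(w) != len(x_ext):
            raise ValueError("each weight vector must have dimension d + 1")
        phi = _encode(w)
        overlap = sum(a * b for a, b in zip(psi, phi))
        prod *= overlap ** 2           # each factor contributes degree 2
    return 0.5 + 0.5 * prod
```

With a single factor ($k = 1$) this reduces to the ordinary SWAP test of Figure 1. For example, `product_module_p0([1.0, 0.0], [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])` multiplies two squared overlaps of 0.5 each, which gives a value of approximately 0.625.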