Quantum Pointwise Convolution: A Flexible and Scalable Approach for Neural Network Enhancement

An Ning, Tai-Yue Li, Nan-Yow Chen

TL;DR

The paper addresses the limitation of linear channel interactions in classical pointwise convolution by introducing Quantum Pointwise Convolution (QPC), a hybrid quantum-classical approach that uses amplitude encoding and strongly entangling circuit blocks to capture nonlinear cross-channel relationships. The method generates multiple feature maps per quantum kernel and leverages weight sharing to reduce parameter count, enabling end-to-end training with a parameter-shift gradient. Empirical results on FashionMNIST and CIFAR10 show that QPC-based models can achieve competitive or superior performance with fewer parameters than their classical counterparts, with deeper quantum layers yielding the best gains. This work suggests that quantum kernels can enhance CNNs' expressiveness in a resource-efficient manner, offering a viable path for integrating quantum circuits into mainstream CNN architectures on near-term hardware.
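The parameter-shift gradient mentioned above can be illustrated on the smallest possible case. This is a minimal sketch, not the paper's training code: it assumes a single $R_X(\theta)$ rotation applied to $|0\rangle$ followed by a Pauli-Z measurement, for which the expectation value is $\cos\theta$. The function names are hypothetical.

```python
import math


def expval_z_after_rx(theta: float) -> float:
    """<Z> for the state RX(theta)|0>, which equals cos(theta)."""
    # RX(theta)|0> = cos(theta/2)|0> - i*sin(theta/2)|1>
    p0 = math.cos(theta / 2) ** 2  # probability of measuring 0
    p1 = math.sin(theta / 2) ** 2  # probability of measuring 1
    return p0 - p1


def parameter_shift_grad(f, theta: float) -> float:
    """Exact gradient of a Pauli-rotation expectation value from two evaluations."""
    shift = math.pi / 2
    return (f(theta + shift) - f(theta - shift)) / 2.0


theta = 0.7
grad = parameter_shift_grad(expval_z_after_rx, theta)
analytic = -math.sin(theta)  # derivative of cos(theta)
```

Unlike a finite-difference estimate, the two shifted evaluations give the exact derivative here, which is why the rule is practical when the circuit can only be sampled on hardware.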

Abstract

In this study, we propose a novel architecture, the Quantum Pointwise Convolution, which incorporates pointwise convolution within a quantum neural network framework. Our approach leverages the strengths of pointwise convolution to efficiently integrate information across feature channels while adjusting channel outputs. By using quantum circuits, we map data to a higher-dimensional space, capturing more complex feature relationships. To address the current limitations of quantum machine learning in the Noisy Intermediate-Scale Quantum (NISQ) era, we implement several design optimizations. These include amplitude encoding for data embedding, allowing more information to be processed with fewer qubits, and a weight-sharing mechanism that accelerates quantum pointwise convolution operations, reducing the need to retrain for each input pixel. In our experiments, we applied the quantum pointwise convolution layer to classification tasks on the FashionMNIST and CIFAR10 datasets, where our model demonstrated competitive performance compared to its classical counterpart. Furthermore, these optimizations not only improve the efficiency of the quantum pointwise convolutional layer but also make it more readily deployable in various CNN-based or deep learning models, broadening its potential applications across different architectures.
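To make the qubit saving of amplitude encoding concrete: $n$ qubits carry $2^n$ amplitudes, so a 64-value channel vector fits into 6 qubits. The sketch below is a plain-Python illustration under stated assumptions, not the authors' implementation. The function name `amplitude_encode`, the zero-padding to the next power of two, and the rejection of all-zero inputs are choices made for this example.

```python
import math


def amplitude_encode(values):
    """Return (amplitudes, n_qubits) for a real-valued feature vector.

    Assumptions for this sketch: the vector is zero-padded to the next
    power of two and L2-normalized so the squared amplitudes sum to 1.
    """
    if not values:
        raise ValueError("need at least one value")
    # Smallest n with 2**n >= len(values); at least one qubit.
    n_qubits = max(1, (len(values) - 1).bit_length())
    padded = list(values) + [0.0] * (2 ** n_qubits - len(values))
    norm = math.sqrt(sum(v * v for v in padded))
    if norm == 0.0:
        raise ValueError("cannot encode the all-zero vector")
    return [v / norm for v in padded], n_qubits


# 64 channel values at one pixel position -> 6 qubits.
channels = [float(c + 1) for c in range(64)]
amps, n_qubits = amplitude_encode(channels)
```

An angle-style encoding that spends one qubit per value would need 64 qubits for the same vector, which is the gap the abstract refers to.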

Paper Structure

This paper contains 28 sections, 18 equations, 8 figures.

Figures (8)

  • Figure 1: The structure of the quantum circuit with strongly entangling layers (blocks). The circuit first applies amplitude encoding, which embeds the classical input data into the amplitudes of the quantum state across the 6 qubits. It then applies the strongly entangling circuit architecture, designed with 6 qubits and organized into 2 sequential blocks $B_1$ and $B_2$. Each block consists of single-qubit $R_X$ and $R_Z$ rotations and CNOT gates applied to every qubit, ensuring that each qubit undergoes individual quantum operations. Each qubit's state is measured using Pauli-Z operators.
  • Figure 2: The structure of a quantum pointwise convolutional layer. From a multi-channel feature map, collect pixel values at the same position across channels and concatenate them into a vector. Feed this vector into quantum pointwise convolution kernels. After measuring each qubit, assign the measurement values back to the corresponding position in the new feature map. By iterating through the entire feature map, multiple new feature maps will be generated, depending on the number of qubits and kernels used.
  • Figure 3: In (a), the quantum model architecture applies quantum pointwise convolution operations: it starts with a quantum 1x1 convolution layer (64 channels), followed by a classical 3x3 convolution layer (64 channels) with ReLU activation, and concludes with a quantum 1x1 convolution layer that expands to 128 channels. In (b), the classical model begins with a 1x1 convolution layer (64 channels) followed by Batch Normalization (BN) and ReLU. This is followed by a 3x3 convolution layer (64 channels) with BN and ReLU, ending with a 1x1 convolution layer that increases the channels to 128, also followed by BN and ReLU.
  • Figure 4: The figure presents a comparison between the quantum and classical models in terms of loss over the training epochs for classification on the FashionMNIST dataset.
  • Figure 5: The figure presents a comparison between the quantum and classical models in terms of accuracy over the training epochs for classification on the FashionMNIST dataset.
  • ...and 3 more figures
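
The circuit in Figure 1 and the layer in Figure 2 can be sketched together with a small state-vector simulator. This is an illustrative sketch only, written in plain Python rather than with a quantum library. The captions do not specify the CNOT topology, the gate order inside a block, or the qubit ordering, so the ring of CNOTs, the $R_X$-then-$R_Z$ order, and the convention that qubit 0 is the most significant bit are all assumptions. All function names are hypothetical, and the demo uses 2 qubits (4 channels) instead of 6 to stay fast.

```python
import cmath
import math


def rx(t):
    c, s = math.cos(t / 2), math.sin(t / 2)
    return [[c, -1j * s], [-1j * s, c]]


def rz(t):
    return [[cmath.exp(-1j * t / 2), 0], [0, cmath.exp(1j * t / 2)]]


def apply_1q(state, gate, q, n):
    """Apply a 2x2 gate to qubit q (qubit 0 = most significant bit)."""
    bit = 1 << (n - 1 - q)
    out = list(state)
    for i in range(len(state)):
        if i & bit == 0:
            a, b = state[i], state[i | bit]
            out[i] = gate[0][0] * a + gate[0][1] * b
            out[i | bit] = gate[1][0] * a + gate[1][1] * b
    return out


def apply_cnot(state, control, target, n):
    """Flip the target bit of every basis state whose control bit is 1."""
    cb, tb = 1 << (n - 1 - control), 1 << (n - 1 - target)
    out = list(state)
    for i in range(len(state)):
        if i & cb and not i & tb:
            out[i], out[i | tb] = state[i | tb], state[i]
    return out


def expval_z(state, q, n):
    bit = 1 << (n - 1 - q)
    return sum((-1 if i & bit else 1) * abs(a) ** 2 for i, a in enumerate(state))


def quantum_kernel(channel_vec, weights, n):
    """One kernel: weights[b][q] = (rx_angle, rz_angle) for block b, qubit q."""
    norm = math.sqrt(sum(v * v for v in channel_vec))
    if norm == 0.0:
        raise ValueError("channel vector must be nonzero")
    state = [complex(v / norm) for v in channel_vec]  # amplitude encoding
    for block in weights:
        for q, (a, b) in enumerate(block):
            state = apply_1q(state, rx(a), q, n)
            state = apply_1q(state, rz(b), q, n)
        for q in range(n):  # assumed ring of CNOTs; requires n >= 2
            state = apply_cnot(state, q, (q + 1) % n, n)
    return [expval_z(state, q, n) for q in range(n)]  # one value per qubit


def quantum_pointwise_conv(feature_map, kernels, n):
    """feature_map[c][y][x] with 2**n channels -> len(kernels) * n output maps."""
    height, width = len(feature_map[0]), len(feature_map[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(len(kernels) * n)]
    for y in range(height):
        for x in range(width):
            vec = [ch[y][x] for ch in feature_map]  # same position, all channels
            for k, weights in enumerate(kernels):   # weights shared by every pixel
                for q, z in enumerate(quantum_kernel(vec, weights, n)):
                    out[k * n + q][y][x] = z
    return out


n = 2
fmap = [[[float(c + y + x + 1) for x in range(2)] for y in range(2)] for c in range(4)]
kernels = [
    [[(0.1 * (k + b + q + 1), 0.2 * (k + b + q + 1)) for q in range(n)] for b in range(2)]
    for k in range(2)
]
out = quantum_pointwise_conv(fmap, kernels, n)
```

Here two kernels on two qubits yield four output maps of the same height and width as the input, matching the caption's statement that the number of new feature maps depends on the number of qubits and kernels. Each kernel holds only two angles per qubit per block, and those angles are reused at every pixel position.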