Quantum-Enhanced Weight Optimization for Neural Networks Using Grover's Algorithm

Stefan-Alexandru Jura, Mihai Udrescu

TL;DR

The paper addresses the challenge of weight optimization in neural networks by replacing gradient-based updates with a quantum-accelerated, gradient-free search using Grover's algorithm. By discretizing each weight's update space into $N$ candidate values and applying amplitude amplification to select the best candidate, the method achieves a per-weight complexity of $O\left(\sqrt{N}\right)$ and reduces overall training time compared to backpropagation while maintaining or improving accuracy on small to medium datasets. Experiments on Wine and Digits demonstrate rapid convergence, high accuracy (up to 100% in some cases), and robustness to moderate noise, with a scalable architecture that can support deeper networks using a small number of qubits. The work highlights practical advantages for near-term quantum devices, offering a viable path toward quantum-enhanced optimization of classical neural networks and motivating further hardware-oriented refinements and larger-scale evaluations.
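
To make the $O\left(\sqrt{N}\right)$ figure concrete, the standard textbook analysis of Grover's search can be worked through for one weight. This is a sketch under two assumptions that this summary does not confirm for the paper: exactly one of the $N$ candidates is marked as best, and $N = 32$ is chosen purely for illustration.

$$\theta = \arcsin\!\left(\frac{1}{\sqrt{N}}\right), \qquad P_k = \sin^2\!\big((2k+1)\,\theta\big), \qquad k_{\mathrm{opt}} = \left\lfloor \frac{\pi}{4\theta} \right\rfloor \approx \frac{\pi}{4}\sqrt{N}$$

$$N = 32: \quad \theta \approx 0.1777, \quad k_{\mathrm{opt}} = 4, \quad P_4 = \sin^2(9\theta) \approx 0.999$$

Here $P_k$ is the probability of measuring the marked candidate after $k$ Grover iterations. Under these assumptions, four oracle calls suffice for 32 candidates, whereas an exhaustive classical scan would evaluate all 32.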

Abstract

The main approach to hybrid quantum-classical neural networks (QNNs) is employing quantum computing to build a neural network (NN) that has quantum features, which is then optimized classically. Here, we propose a different strategy: to use quantum computing in order to optimize the weights of a classical NN. As such, we design an instance of Grover's quantum search algorithm to accelerate the search for the optimal parameters of an NN during the training process, a task traditionally performed using the backpropagation algorithm with the gradient descent method. Indeed, gradient descent has issues such as exploding gradients, vanishing gradients, and the convexity problem. Other methods have tried to address such issues with strategies like genetic searches, but they carry additional problems like convergence consistency. Our original method avoids these issues (because it does not calculate gradients) and capitalizes on classical architectures' robustness and Grover's quadratic speedup in high-dimensional search spaces to significantly reduce test loss (58.75%) and improve test accuracy (35.25%), compared to classical NN weight optimization, on small datasets. Unlike most QNNs that are trained on small datasets only, our method is also scalable, as it allows the optimization of deep networks; for an NN with 3 hidden layers, trained on the Digits dataset from scikit-learn, we obtained a mean accuracy of 97.7%. Moreover, our method requires a much smaller number of qubits compared to other QNN approaches, making it very practical for near-future quantum computers that will still deliver a limited number of logical qubits.
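
The gradient-free search described in the abstract can be illustrated with a small classical simulation of Grover's algorithm. The sketch below is not the authors' circuit: the function names `grover_search` and `best_candidate`, the quadratic toy loss, and the interval $[-1, 1]$ are all hypothetical, and the marked index is found by classical loss evaluation only so that the simulation has something to amplify. How the paper builds its oracle is not stated in this summary.

```python
import math


def grover_search(num_candidates, marked_index):
    """Simulate Grover's search classically on a real-valued state vector.

    Returns (measurement probabilities, number of Grover iterations used).
    """
    # Start in the uniform superposition over all candidate indices.
    amps = [1.0 / math.sqrt(num_candidates)] * num_candidates
    # Near-optimal iteration count for a single marked item.
    theta = math.asin(1.0 / math.sqrt(num_candidates))
    iterations = int(math.floor(math.pi / (4.0 * theta)))
    for _ in range(iterations):
        # Oracle: flip the sign of the marked amplitude.
        amps[marked_index] = -amps[marked_index]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amps) / num_candidates
        amps = [2.0 * mean - a for a in amps]
    return [a * a for a in amps], iterations


def best_candidate(candidates, loss_fn):
    """Return the candidate Grover's search is most likely to measure."""
    # Assumption: the marked index comes from classical loss evaluation,
    # used here only as a stand-in for a quantum oracle.
    losses = [loss_fn(c) for c in candidates]
    marked = losses.index(min(losses))
    probs, iterations = grover_search(len(candidates), marked)
    chosen = probs.index(max(probs))
    return candidates[chosen], probs[chosen], iterations


# Toy usage: 32 candidate values for one weight, loss minimized at w = 0.3.
candidates = [-1.0 + 2.0 * i / 31 for i in range(32)]
value, prob, iters = best_candidate(candidates, lambda w: (w - 0.3) ** 2)
print(f"chosen weight {value:.4f}, probability {prob:.4f}, iterations {iters}")
```

The simulation keeps only real amplitudes, which is sufficient for the oracle and diffusion steps used here. A real device would run the same two reflections as gates on an index register rather than as list operations.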

Paper Structure

This paper contains 13 sections, 5 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Architecture of the proposed method, which can be extended to an arbitrary number of hidden layers. The dataset is split into $80\%$ for training and $20\%$ for testing, followed by normalization, one-hot encoding, search interval definition, and discretization. The images are flattened into a one-dimensional array and fed into a hidden (dense) layer of $64$ units. The hidden layer sends candidate weights to the Grover circuit, which returns updated weights via amplitude amplification; these are then integrated into the next hidden layer of $32$ units. The process is repeated for the final $16$-node layer, and softmax is applied to determine the network's outputs.
  • Figure 2: Evolution of training and test loss over 10 epochs, comparing classical Adam (blue) and quantum Grover-optimized (red) neural networks. Circle markers indicate training loss, while dashed lines indicate test loss. The quantum-optimized approach converges faster and reaches a lower final loss than the classical method.
  • Figure 3: Training and test accuracy over 10 epochs. Test accuracy is shown with dashed lines and training accuracy with circle markers. The quantum-optimized model learns the data structure quickly and reaches near-optimal accuracy within the first few epochs of training, while the classical approach improves more gradually.
  • Figure 4: Final test accuracy as the hidden resolution (i.e., the number of candidate values for each layer) varies from 17 to 32. For resolutions between 17 and 31, the test accuracy remains consistently high, at around 97% or more, indicating stable and robust performance. At resolution 32, however, the accuracy drops abruptly to approximately 91%, suggesting that excessively fine discretization of the candidate space can lead to numerical instability or degraded model performance.
  • Figure 5: Final test loss as the hidden resolution (i.e., the number of candidate values for each layer) increases from 17 to 32. For resolutions between 17 and 31, the test loss remains consistently low, indicating stable training and strong generalization. At resolution 32, however, the test loss spikes sharply to around 1.27, suggesting that an overly fine discretization of the weight space can destabilize the optimization process and degrade the model's performance through overfitting.
  • ...and 4 more figures
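
The "search interval definition and discretization" step in Figure 1 and the "hidden resolution" varied in Figures 4 and 5 can be sketched as follows. The helper names `candidate_grid` and `index_qubits`, the evenly spaced grid, the interval $[-1, 1]$, and the binary index register are assumptions made for illustration; the paper's actual interval and encoding are not given in this summary.

```python
import math


def candidate_grid(center, half_width, resolution):
    """Return `resolution` evenly spaced candidate values in
    [center - half_width, center + half_width]."""
    if resolution < 2:
        return [center]
    step = 2.0 * half_width / (resolution - 1)
    return [center - half_width + i * step for i in range(resolution)]


def index_qubits(resolution):
    """Qubits needed to index `resolution` candidates in a binary register
    (assumed encoding, not confirmed by the source)."""
    return max(1, math.ceil(math.log2(resolution)))


# Sweep a few of the resolutions covered by Figures 4 and 5.
for resolution in (17, 24, 31, 32):
    grid = candidate_grid(center=0.0, half_width=1.0, resolution=resolution)
    qubits = index_qubits(resolution)
    # Basis states of the register that do not correspond to any candidate.
    unused = 2 ** qubits - resolution
    print(resolution, qubits, unused, round(grid[1] - grid[0], 4))
```

Under the binary-register assumption, every resolution from 17 to 32 fits in the same five-qubit index register, and only resolution 32 uses all of its basis states. The sketch says nothing about why accuracy changes at resolution 32; it only shows how the candidate grid and register size relate.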