Quantum-Classical Hybrid Quantized Neural Network

Wenxin Li, Chuan Wang, Hongdong Zhu, Qi Gao, Yin Ma, Hai Wei, Kai Wen

TL;DR

The paper develops a quantum-classical framework for training quantized neural networks. It recasts learning as a Quadratic Constrained Binary Optimization (QCBO) problem via spline-based forward interval propagation, which admits arbitrary activation and loss functions. It establishes a copositive lifting to a completely positive program and introduces a scalable DLBO approach that decomposes the problem into tractable per-sample subproblems solvable on quantum hardware. The authors prove convergence of the Quantum Conditional Gradient Descent (QCGD) algorithm under noisy quantum oracle conditions and integrate Quantum Progressive Hedging (QPH) to coordinate the distributed subproblems through consensus and hedging updates. Empirical results show competitive accuracy at ultra-low bit-widths (e.g., 1.1-bit) with significantly reduced training times, which suggests potential for quantum-accelerated, edge-deployable neural networks. Overall, the work provides a unified, provably robust route to leveraging quantum optimization for scalable, high-performing quantized neural networks.

Abstract

In this work, we introduce a novel Quadratic Binary Optimization (QBO) framework for training a quantized neural network. The framework enables the use of arbitrary activation and loss functions through spline interpolation, while Forward Interval Propagation addresses the nonlinearities and the multi-layered, composite structure of neural networks by discretizing activation functions into linear subintervals. This preserves the universal approximation properties of neural networks while making complex nonlinear functions accessible to quantum solvers, broadening their applicability in artificial intelligence. Theoretically, we derive an upper bound on the approximation error and the number of Ising spins required by deriving the sample complexity of the empirical risk minimization problem from an optimization perspective. A key challenge in solving the associated large-scale Quadratic Constrained Binary Optimization (QCBO) model is the presence of numerous constraints. To overcome this, we adopt the Quantum Conditional Gradient Descent (QCGD) algorithm, which solves QCBO directly on quantum hardware. We establish the convergence of QCGD under a quantum oracle subject to randomness, bounded variance, and limited coefficient precision, and provide an upper bound on the Time-To-Solution. To enhance scalability, we incorporate a decomposed copositive optimization scheme that replaces the monolithic lifted model with sample-wise subproblems. This decomposition substantially reduces the quantum resource requirements and enables efficient low-bit neural network training. Finally, we propose using the QCGD and Quantum Progressive Hedging (QPH) algorithms to solve the decomposed problem efficiently.
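The discretization of an activation function into linear subintervals, as described in the abstract, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the choice of `tanh`, the range $[-3, 3]$, and the 12 segments are all assumptions made for illustration. The one-hot vector `beta` plays the role of an interval selector: exactly one entry is 1, and it picks the linear piece applied to the pre-activation `z`.

```python
import numpy as np

def build_segments(act, lo, hi, k):
    """Knots, slopes, and intercepts of a k-piece linear interpolant of act on [lo, hi]."""
    knots = np.linspace(lo, hi, k + 1)
    vals = act(knots)
    slopes = np.diff(vals) / np.diff(knots)
    intercepts = vals[:-1] - slopes * knots[:-1]
    return knots, slopes, intercepts

def interval_indicator(z, knots):
    """One-hot vector beta marking the subinterval that contains z (clipped to the range)."""
    k = len(knots) - 1
    idx = int(np.clip(np.searchsorted(knots, z, side="right") - 1, 0, k - 1))
    beta = np.zeros(k, dtype=int)
    beta[idx] = 1
    return beta

def approx_activation(z, knots, slopes, intercepts):
    """Evaluate the piecewise-linear surrogate: the selected piece is slope * z + intercept."""
    beta = interval_indicator(z, knots)
    return float(beta @ (slopes * z + intercepts))

# Hypothetical setup: approximate tanh on [-3, 3] with 12 linear pieces.
knots, slopes, intercepts = build_segments(np.tanh, -3.0, 3.0, 12)
grid = np.linspace(-3.0, 3.0, 601)
max_err = max(abs(approx_activation(z, knots, slopes, intercepts) - np.tanh(z)) for z in grid)
print(f"max abs error over the grid: {max_err:.4f}")
```

The surrogate output is a product of a binary selector and a term linear in `z`. Under the assumption that `z` is itself encoded in binary variables, such products are quadratic in the binary unknowns, which is the kind of term a quadratic binary model can represent. Increasing the number of segments tightens the approximation at the cost of more selector variables.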

Paper Structure

This paper contains 23 sections, 7 theorems, 110 equations, 7 figures, 4 tables.

Key Result

Lemma 1

The VC dimension of a neural network with piecewise polynomial activation functions is bounded by $O(W^{3}L^{2})$. In particular, when the activation functions are piecewise constant, the VC dimension is no more than $O(W^{2}L(\log W + \log L))$.
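To get a feel for the gap between the two bounds in Lemma 1, the sketch below evaluates both expressions with the hidden constants dropped. It assumes that $W$ denotes the number of weights and $L$ the number of layers (the usual convention, not defined in this summary), uses the natural logarithm, and picks $W = 1000$, $L = 10$ purely for illustration. Because big-O constants are ignored, only the relative growth is meaningful.

```python
import math

def vc_bound_piecewise_polynomial(w, l):
    """Growth term W^3 L^2 from Lemma 1 (constant factor omitted)."""
    return w**3 * l**2

def vc_bound_piecewise_constant(w, l):
    """Growth term W^2 L (log W + log L) from Lemma 1 (constant factor omitted)."""
    return w**2 * l * (math.log(w) + math.log(l))

# Hypothetical network size, chosen only for illustration.
w, l = 1000, 10
poly = vc_bound_piecewise_polynomial(w, l)
const = vc_bound_piecewise_constant(w, l)
ratio = poly / const
print(f"piecewise polynomial term: {poly:.3e}")
print(f"piecewise constant term:   {const:.3e}")
print(f"ratio:                     {ratio:.1f}")
```

For this size the piecewise-constant term is smaller by roughly three orders of magnitude.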

Figures (7)

  • Figure 1: The flowchart of the quantized neural network training process using CIM and hybrid techniques.
  • Figure 2: (a) Feedforward neural network with piecewise linear activation function. (b) Schematic representation of forward interval propagation (FIP) in a neural network, illustrating the discretization of activation functions into linear subintervals. The diagram shows multiple input neurons (left) connected through weights $(w_{0}, w_{1},\ldots, w_{m})$ and a bias term $b$ to a summation node $\Sigma$, followed by an activation function. The green-highlighted intervals and the $\beta_{i}$ values indicate the subintervals to which the activation function's outputs belong. These determine the output subinterval passed to the next layer, which captures the multi-layer composite relationships during forward propagation.
  • Figure 3: The monolithic copositive formulation constructs a single large cone whose dimension scales with the number of samples $N$. DLBO replaces this with sample-wise completely positive cones that share global parameters $(\mathbf{x}, \mathbf{X})$. Each epoch solves per-sample QUBO subproblems on quantum Ising hardware and aggregates their solutions to update the shared parameters, enabling scalable quantum-classical training.
  • Figure 4: (a) Energy evolution curve during the optimization of a quantized neural network using a coherent Ising machine (CIM); (b) Comparative analysis of test accuracy and running time for different training algorithms (STE [bengio2013estimating], BinaryConnect [courbariaux2015binaryconnect]); (c) Performance comparison of memory demand, inference time, and accuracy before and after quantization and activation function approximation.
  • Figure 5: The figure illustrates the convergence of the Quantum Conditional Gradient Descent (QCGD) algorithm using CIM (green and red) and Gurobi (cyan and purple) solvers over 321 iterations on a log-log scale. The left y-axis shows the objective residual, and the right y-axis shows the constraint residual, with lines representing the residuals for CIM (solid) and Gurobi (dashed).
  • ...and 2 more figures

Theorems & Definitions (9)

  • Lemma 1: [bartlett2019nearly, cover1968capacity]
  • Theorem 1
  • Theorem 2
  • Lemma 2
  • Definition 1: $(\delta,\varepsilon)$-Inexact Oracle [dunn1978conditional, locatello2017unified]
  • Proposition 1
  • Lemma 3
  • Theorem 3
  • Example 1: Sign function