Quantum Optimization for Training Quantum Neural Networks

Yidong Liao, Min-Hsiu Hsieh, Chris Ferrie

TL;DR

The paper addresses training quantum neural networks on NISQ devices, where barren plateaus hinder classical optimisers. It introduces a maximally quantum training framework based on phase-oracle cost encoding and adaptive QAOA-like mixers (AC-QAOA), plus a baseline based on Grover adaptive search (GAS). By replacing phase encoding with phase oracles and using adaptive mixers, the approach aims to achieve beyond-Grover speedups and to handle barren plateaus better, with applications to VQE, pure-state generation, and quantum classification. The framework uses amplitude and Hadamard tests for cost encoding, LCU and phase-estimation techniques, and machine-learning-inspired mixer selection to realize shallow, structured quantum training with potential practical impact on near-term quantum devices.

Abstract

Training quantum neural networks (QNNs) using gradient-based or gradient-free classical optimisation approaches is severely impacted by the presence of barren plateaus in the cost landscapes. In this paper, we devise a framework for leveraging quantum optimisation algorithms to find optimal parameters of QNNs for certain tasks. To achieve this, we coherently encode the cost function of QNNs onto relative phases of a superposition state in the Hilbert space of the network parameters. The parameters are tuned with an iterative quantum optimisation structure using adaptively selected Hamiltonians. The quantum mechanism of this framework exploits hidden structure in the QNN optimisation problem and hence is expected to provide beyond-Grover speed up, mitigating the barren plateau issue.

Paper Structure

This paper contains 39 sections, 83 equations, 30 figures, 3 tables.

Figures (30)

  • Figure 1: Schematic of our quantum training algorithm for VQE. We use the training of VQE as an example to present the schematic circuit construction of our quantum training algorithm for QNNs. A video animation of the circuit construction is available at https://youtu.be/RVWkJZY6GNY. (This is a vector image, best viewed with the zoom feature in standard PDF viewers.) Notes: 1. In all figures of this paper, we omit the minus signs in all time-evolution-like terms (i.e. exponentials of a Hamiltonian, $e^{-iHt}$) for the sake of brevity and space. 2. Some quantum registers are not depicted in this figure due to space limitations.
  • Figure 2: QAOA-like training protocol for QNN, proposed in Ref. verdon2018universal. The quantum training protocol consists of two alternating operations in a QAOA fashion. The first operation (blue blocks) acts on both the parameter register and the QNN register to encode the cost function of the QNN onto a relative phase of the parameter state. The second operation (pink blocks) acts only on the parameter register; it is a variant of the original QAOA mixers, tailored to the case where the QNN parameters are continuous variables. These two operations can be expressed mathematically as $e^{-i \gamma_i C(\boldsymbol \theta)}$ and $e^{-i\beta_i H_M}$, where $\boldsymbol \theta$ are the parameters of the QNN, $C(\boldsymbol \theta)$ is the cost function of the QNN, $\gamma_i$ and $\beta_i$ are tunable hyperparameters, and $H_M$ is the mixer Hamiltonian. The width of each block represents the hyperparameters $\gamma_i$ and $\beta_i$: the wider the block, the larger the value of the hyperparameter. The phase encoding operation $e^{-i \gamma_i H_C}$ acts as $e^{-i \gamma_i C(\boldsymbol \theta)}$.
  • Figure 3: Schematic of our framework for quantum training of QNN. Our quantum training framework for QNNs takes advantage of the well-established parts of Refs. verdon2018universal and Gily_n_2019 while eliminating their shortcomings. We replace the phase encoding operations in the QAOA-like protocol of Ref. verdon2018universal (depicted in Fig. 2) with the phase oracle of Ref. Gily_n_2019. For the mixers in the QAOA-like routine, we allow a different mixer for each layer, similar to Ref. zhu2020adaptive. In this figure, the color of each block represents the nature of the corresponding Hamiltonian: different colors correspond to different Hamiltonians (the cost Hamiltonian is the same throughout the training, whereas the mixer varies from layer to layer). The mixer pool contains mixers tailored to our QNN training problem. These conventions also apply to the other circuit schematics in this paper.
  • Figure 4: Interference process of QAOA. QAOA is an interference-based algorithm in which non-target states interfere destructively while target states interfere constructively. Here we illustrate this interference process by presenting the evolution of the quantum state of the parameters (black bar graphs on the yellow plane) alongside the QAOA operations (blue and pink boxes on the circuit lines, representing the phase encoding and the mixers respectively). The starting state $\sum_{\theta} \ket{\theta}$ (omitting the normalization factor) is the equal superposition of all possible parameter configurations. After the first phase encoding operation, the state becomes $\sum_\theta e^{-i{\gamma}_{1}C(\theta)} \ket{\theta}$, where the opacity of the bars indicates the value of the phase; the magnitudes of the amplitudes remain unchanged. After the first mixer, the state becomes $\sum_\theta \Psi_{C(\theta)}\ket{\theta}$, in which the magnitudes of the amplitudes have changed. The same process repeats for the following operations, until the amplitudes of the optimal parameter configurations are amplified significantly (the furthest bar graph). The grey bar graph in the right corner is the cost function being optimized by QAOA.
  • Figure 5: Quantum circuit schematic of the operations in the original QAOA. The state is initialized by applying a Hadamard gate to each qubit, represented as $H^{\otimes n}$. This results in the equal superposition of all possible solutions. QAOA consists of alternating time evolution under the two Hamiltonians $H_C$ and $H_{\mathrm{M}}$ for $p$ rounds, where the durations in round $j$ are specified by the parameters $\gamma_j$ and $\beta_j$, respectively. In the original QAOA, the mixing Hamiltonian is chosen to be $H_{\mathrm{M}} = \sum_{j=1}^n X_j$. After all $p$ rounds, the state becomes $\ket{\boldsymbol \beta, \boldsymbol \gamma} = e^{-i \beta_p H_{\mathrm{M}}}e^{-i \gamma_p H_C} \dots e^{-i \beta_2 H_{\mathrm{M}}}e^{-i \gamma_2 H_C}e^{-i \beta_1 H_{\mathrm{M}}}e^{-i \gamma_1 H_C} \ket{s}$, where $\ket{s}$ is the initial equal superposition state.
  • ...and 25 more figures
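
The alternating structure described in the captions of Figures 4 and 5 can be illustrated with a small classical state-vector simulation. The following is only a minimal sketch under stated assumptions, not the paper's construction: the parameter register is a discrete 3-qubit register, the cost $C(\theta)$ is a toy function (the Hamming weight of the bit string, chosen purely for illustration), the cost is applied as an explicit diagonal phase rather than through a phase oracle, and the mixer is the fixed $H_{\mathrm{M}} = \sum_j X_j$ of the original QAOA rather than an adaptively selected one. The helper names `phase_encode`, `apply_x_mixer` and `qaoa_state` are hypothetical.

```python
import numpy as np


def phase_encode(state, cost, gamma):
    """Multiply each basis amplitude by exp(-i * gamma * C(theta))."""
    return np.exp(-1j * gamma * cost) * state


def apply_x_mixer(state, beta):
    """Apply exp(-i * beta * sum_j X_j), which factorises into one rotation per qubit."""
    n = state.size.bit_length() - 1  # number of qubits for a length-2**n vector
    # exp(-i * beta * X) = cos(beta) * I - i * sin(beta) * X
    rx = np.array([[np.cos(beta), -1j * np.sin(beta)],
                   [-1j * np.sin(beta), np.cos(beta)]])
    psi = state.reshape((2,) * n)
    for q in range(n):
        # Contract the rotation with qubit axis q, then restore the axis order.
        psi = np.tensordot(rx, psi, axes=([1], [q]))
        psi = np.moveaxis(psi, 0, q)
    return psi.reshape(-1)


def qaoa_state(cost, gammas, betas):
    """Start from the equal superposition and alternate phase encoding and mixing."""
    dim = cost.size
    state = np.full(dim, 1.0 / np.sqrt(dim), dtype=complex)
    for gamma, beta in zip(gammas, betas):
        state = phase_encode(state, cost, gamma)
        state = apply_x_mixer(state, beta)
    return state


# Toy cost (an assumption for illustration): Hamming weight, minimised at theta = 0.
n_qubits = 3
dim = 2 ** n_qubits
cost = np.array([bin(t).count("1") for t in range(dim)], dtype=float)

uniform = np.full(dim, 1.0 / np.sqrt(dim), dtype=complex)
after_phase = phase_encode(uniform, cost, np.pi / 2)   # magnitudes stay uniform
final = qaoa_state(cost, [np.pi / 2], [3 * np.pi / 4])  # one QAOA round
probs = np.abs(final) ** 2

print("magnitudes after phase encoding:", np.round(np.abs(after_phase), 3))
print("probabilities after one round:  ", np.round(probs, 3))
```

Because this toy cost is a sum of single-qubit terms, the state stays a product state and each qubit ends in $\ket{0}$ with probability $(1 - \sin 2\beta \sin\gamma)/2$. With $\gamma = \pi/2$ and $\beta = 3\pi/4$ a single round therefore concentrates all probability on the minimum-cost configuration. Cost functions with genuine structure generally need several rounds and tuned angles.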

Theorems & Definitions (2)

  • proof
  • proof