Trainability Enhancement of Parameterized Quantum Circuits via Reduced-Domain Parameter Initialization

Yabo Wang, Bo Qi, Chris Ferrie, Daoyi Dong

TL;DR

The paper tackles the difficulty of training parameterized quantum circuits (PQCs) due to barren plateaus and spurious local minima. It introduces a depth-aware reduced-domain parameter initialization, showing that choosing the initial domain size $a=\Theta(1/\sqrt{L})$ yields gradient norms that decay only polynomially with circuit depth, and, for local Pauli-sum Hamiltonians, scale favorably with the number of terms. Theoretical results provide explicit lower bounds on $\mathbb{E}\|\nabla_{\boldsymbol{\theta}}C\|^2$ and gradient variances, implying avoidance of barren plateaus under polynomial-depth regimes. Numerical experiments on VQE (HEA and Hamiltonians with local Pauli terms) and QNNs corroborate the theory: reduced-domain initialization enhances trainability, accelerates convergence, and improves the ability to generate entanglement, even under finite-shot noise. Overall, the method offers a principled and practical initialization tactic to enhance the trainability and convergence of VQAs, with potential to unlock quantum advantages in realistic tasks.
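The initialization rule summarized above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `reduced_domain_init`, the constant `c` hidden inside $\Theta(1/\sqrt{L})$, and the cap at $a=1$ are all assumptions made here. The paper fixes the actual value of $a$ through its theorems, which are not reproduced in this summary.

```python
import math
import random


def reduced_domain_init(num_params, depth, c=1.0, seed=None):
    """Sample parameters uniformly from [-a*pi, a*pi] with a = c / sqrt(depth).

    `c` is a hypothetical constant standing in for the Theta(1/sqrt(L)) scaling;
    the cap at a = 1 keeps the domain inside the full range [-pi, pi].
    """
    rng = random.Random(seed)
    a = min(1.0, c / math.sqrt(depth))
    theta = [rng.uniform(-a * math.pi, a * math.pi) for _ in range(num_params)]
    return a, theta


# Example: 4 qubits, 16 blocks, two rotation angles per qubit per block.
a, theta = reduced_domain_init(num_params=2 * 4 * 16, depth=16, seed=0)
print(a, len(theta))
```

Deeper circuits get a narrower sampling window, so the initial circuit stays closer to the identity as $L$ grows.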

Abstract

Parameterized quantum circuits (PQCs) have been widely used as a machine learning model to explore the potential of achieving quantum advantages for various tasks. However, training PQCs is notoriously challenging owing to the phenomenon of barren plateaus and/or the existence of (exponentially) many spurious local minima. To enhance trainability, in this work we propose an efficient parameter initialization strategy with theoretical guarantees. We prove that by shrinking the initial domain of each parameter in inverse proportion to the square root of the circuit depth, the magnitude of the cost gradient decays at most polynomially with respect to qubit count and circuit depth. Our theoretical results are substantiated through numerical simulations of variational quantum eigensolver tasks. Moreover, we demonstrate that the reduced-domain initialization strategy can protect specific quantum neural networks from exponentially many spurious local minima. Our results highlight the significance of an appropriate parameter initialization strategy, offering insights to enhance the trainability and convergence of variational quantum algorithms.
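To make the gradient claim concrete, the following toy state-vector simulation compares the squared gradient norm under full-domain and reduced-domain sampling. It is a rough sketch under several assumptions that are not taken from the paper: a 4-qubit, 4-block circuit of nearest-neighbor CZ gates followed by $R_X$ and $R_Y$ on each qubit, the single observable $Z_0 Z_1$ in place of the Heisenberg Hamiltonian, $a = 1/\sqrt{L}$ with the constant set to 1, and gradients obtained with the parameter-shift rule. All function names are hypothetical.

```python
import math
import random


def apply_1q(state, q, m):
    """Apply a 2x2 matrix m to qubit q of a state vector, in place."""
    bit = 1 << q
    for i in range(len(state)):
        if not i & bit:
            a, b = state[i], state[i | bit]
            state[i] = m[0][0] * a + m[0][1] * b
            state[i | bit] = m[1][0] * a + m[1][1] * b


def cost(theta, n, depth):
    """<Z_0 Z_1> after `depth` blocks of [CZ chain, R_X, R_Y] applied to |0...0>."""
    state = [0j] * (1 << n)
    state[0] = 1 + 0j
    k = 0
    for _ in range(depth):
        for q in range(n - 1):  # nearest-neighbor CZ layer, open boundary
            for i in range(len(state)):
                if (i >> q) & 1 and (i >> (q + 1)) & 1:
                    state[i] = -state[i]
        for q in range(n):
            c, s = math.cos(theta[k] / 2), math.sin(theta[k] / 2)
            apply_1q(state, q, ((c, -1j * s), (-1j * s, c)))  # R_X
            c, s = math.cos(theta[k + 1] / 2), math.sin(theta[k + 1] / 2)
            apply_1q(state, q, ((c, -s), (s, c)))  # R_Y
            k += 2
    # Z_0 Z_1 has eigenvalue +1 when bits 0 and 1 agree, -1 otherwise.
    return sum(abs(amp) ** 2 * (1 if (i & 1) == ((i >> 1) & 1) else -1)
               for i, amp in enumerate(state))


def grad_norm_sq(theta, n, depth):
    """Squared gradient norm via the parameter-shift rule (shift of pi/2)."""
    total = 0.0
    for k in range(len(theta)):
        plus = theta[:k] + [theta[k] + math.pi / 2] + theta[k + 1:]
        minus = theta[:k] + [theta[k] - math.pi / 2] + theta[k + 1:]
        total += (0.5 * (cost(plus, n, depth) - cost(minus, n, depth))) ** 2
    return total


def mean_grad_norm_sq(a, n, depth, samples=5, seed=0):
    """Average squared gradient norm over parameters drawn from U[-a*pi, a*pi]."""
    rng = random.Random(seed)
    num = 2 * n * depth
    vals = [grad_norm_sq([rng.uniform(-a * math.pi, a * math.pi) for _ in range(num)],
                         n, depth) for _ in range(samples)]
    return sum(vals) / samples


n, depth = 4, 4
print("full domain   :", mean_grad_norm_sq(1.0, n, depth))
print("reduced domain:", mean_grad_norm_sq(1.0 / math.sqrt(depth), n, depth))
```

A circuit this small cannot exhibit a barren plateau, so the printed numbers only illustrate how the comparison is set up; the scaling statements themselves rest on the paper's theorems and its larger simulations.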

Paper Structure

This paper contains 15 sections, 9 theorems, 89 equations, 7 figures, and 1 table.

Key Result

Theorem 1

Consider the VQE problem, where the cost function is Eq. cost with the Hamiltonian being Eq. generalH, the ansatz illustrated as in Fig. 1(a), the initial state chosen as $\rho=|0\rangle\langle0|$, and each parameter in $\boldsymbol{\theta}$ chosen independently from $\left[ -a\pi,a\pi\right]$ [...] where $|\mathcal{N}|$ denotes the cardinality of the set $\mathcal{N}$. Furthermore, if the hyperpa

Figures (7)

  • Figure 1: Setup of HEA. (a) PQC for VQE. It consists of $L$ blocks. In the $l$th block, we first perform the entanglement layer composed of $\mathrm{CZ}$ gates, and then successively perform $R_X$ gate and $R_Y$ gate on each qubit. (b) $\mathrm{CZ}_{l}$ layer with the fully connected topology. (c) $\mathrm{CZ}_{l}$ layer with the nearest-neighbor pairs topology and open boundary condition.
  • Figure 2: Gradient magnitudes of the Heisenberg model at the initial step versus the number of qubits under five initialization strategies: zero-initialization (not displayed where the gradient values are consistently zero), uniform (square), Floquet (cross), Gaussian (triangle), and reduced-domain (circle). Here, $L=5N$, $W=0.2$, $\sigma^2=\frac{1}{8SL}=\frac{1}{80N}$, and $a$ is determined according to Eq. TH2a. There exists a BP under the uniform initialization, whereas the other three strategies help PQCs avoid BPs. Under the three advanced initialization strategies, the gradient norms increase exponentially with the qubit number, among which the reduced-domain exhibits the most favorable growth.
  • Figure 3: Convergence behavior of the $12$-qubit Heisenberg model under different initializations in the HEA context. The dash-dotted grey line corresponds to that of the global minimum. Here, $L=3.5N=42$. (a) Convergence of the cost values under different hyperparameters of the reduced-domain initialization: $a^{*}\approx0.0736328$ (solid blue), $a=0.7$ (dotted red), $a=0.07$ (thin dashed green), and $a=0.007$ (dashed orange). (b) Convergence of the cost values under different initializations. (c) Convergence of the average von Neumann entropy under different initializations. Here, zero-initialization: dotted dark red, uniform: dotted red, Floquet: thin dashed orange, Gaussian: dashed green, and reduced-domain: solid blue. The corresponding hyperparameters are $W=0.4$, $\sigma^2=\frac{1}{8SL}=\frac{1}{672}\approx 0.001488$, and $a^{*}=0.0736328$, respectively.
  • Figure 4: Setup of $L$-block HVA for the 12-qubit Heisenberg model. Here, $L=2N=24$. The purple shaded region consists of Pauli-$X$ gates, Hadamard gates and CX gates. It denotes the preparation of the initial reference state $\otimes^{6}|\Psi^{-}\rangle$ with $|\Psi^{-}\rangle=(1/\sqrt{2})(|01\rangle-|10\rangle)$. The two-qubit parameterized rotation gates are $R_{\Gamma\Gamma}\left(x\right)=\exp{\left\lbrace-i\frac{x}{2}\Gamma\otimes\Gamma\right\rbrace}$ with $\Gamma\Gamma\in\{XX,~YY,~ZZ\}$, and $x\in\{\theta_{l}, \phi_{l}, \beta_{l}, \gamma_{l}\}$.
  • Figure 5: Convergence behavior of the $12$-qubit Heisenberg model under different initializations in the HVA context. The dash-dotted grey line corresponds to that of the global minimum. Here, $L=2N=24$. (a) Convergence of the cost values under different hyperparameters of the reduced-domain initialization $\mathcal{U}\left[-a\pi, a\pi\right]$: $a=0.07$ (solid blue), $a=0.7$ (dotted red), and $a=0.007$ (dashed orange). (b) Convergence of the cost values under different initializations. Here, zero-initialization: dotted dark red, $\pi$-initialization: dash-dotted black, uniform: dotted red, Floquet: thin dashed orange, Gaussian: dashed green, and reduced-domain: solid blue. The corresponding hyperparameters are $W=0.4$, $\sigma^2= 0.001488$, and $a=0.07$, respectively.
  • ...and 2 more figures
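The Gaussian-initialization variances quoted in the captions of Figures 2 and 3 can be cross-checked with exact arithmetic. In the sketch below, the helper name `gaussian_variance` is hypothetical, and $S$ is not defined in this summary; the value $S=2$ is simply what the quoted equalities $\frac{1}{8SL}=\frac{1}{80N}$ (with $L=5N$) and $\frac{1}{8SL}=\frac{1}{672}$ (with $L=42$) imply.

```python
from fractions import Fraction


def gaussian_variance(S, L):
    """sigma^2 = 1 / (8 S L), the formula quoted in the figure captions."""
    return Fraction(1, 8 * S * L)


S = 2  # inferred from the captions, not stated in this summary

# Figure 2: L = 5N, so sigma^2 should reduce to 1 / (80 N) for every N.
for N in (2, 4, 8, 12):
    assert gaussian_variance(S, 5 * N) == Fraction(1, 80 * N)

# Figure 3: N = 12 and L = 3.5 N = 42.
sigma2_fig3 = gaussian_variance(S, 42)
print(sigma2_fig3, float(sigma2_fig3))  # 1/672, about 0.001488
```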

Theorems & Definitions (15)

  • Theorem 1
  • Lemma 2
  • Theorem 3
  • Lemma 4
  • Lemma 6
  • proof
  • Corollary 7
  • Lemma 8
  • proof
  • Corollary 9
  • ...and 5 more