Hardware-efficient ansatz without barren plateaus in any depth

Chae-Yeun Park, Minhyeok Kang, Joonsuk Huh

TL;DR

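This paper proposes two novel parameter conditions under which the hardware-efficient ansatz (HEA) is free from barren plateaus at arbitrary circuit depth. In the first, the HEA approximates a time-evolution operator generated by a local Hamiltonian, and the authors prove a constant lower bound on gradient magnitudes for both local and global observables. In the second, the HEA lies within the many-body localized (MBL) phase, and the authors argue, using a phenomenological model of the MBL system, that it has a large gradient component for a local observable.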

Abstract

Variational quantum circuits have recently gained much interest due to their relevance in real-world applications, such as combinatorial optimizations, quantum simulations, and modeling a probability distribution. Despite their huge potential, the practical usefulness of those circuits beyond tens of qubits is largely questioned. One of the major problems is the so-called barren plateaus phenomenon. Quantum circuits with a random structure often have a flat cost-function landscape and thus cannot be trained efficiently. In this paper, we propose two novel parameter conditions in which the hardware-efficient ansatz (HEA) is free from barren plateaus for arbitrary circuit depths. In the first condition, the HEA approximates to a time-evolution operator generated by a local Hamiltonian. Utilizing a recent result by [Park and Killoran, Quantum 8, 1239 (2024)], we prove a constant lower bound of gradient magnitudes in any depth both for local and global observables. On the other hand, the HEA is within the many-body localized (MBL) phase in the second parameter condition. We argue that the HEA in this phase has a large gradient component for a local observable using a phenomenological model for the MBL system. By initializing the parameters of the HEA using these conditions, we show that our findings offer better overall performance in solving many-body Hamiltonians. Our results indicate that barren plateaus are not an issue when initial parameters are smartly chosen, and other factors, such as local minima or the expressivity of the circuit, are more crucial.

Paper Structure

This paper contains 17 sections, 6 theorems, 71 equations, and 6 figures.

Key Result

Theorem 1

Let $C(\pmb{\theta}) = \braket{\psi(\pmb{\theta})|O|\psi(\pmb{\theta})}$ be the cost function where $O$ is either a Pauli string or $k$-local Hamiltonian. Suppose that there exist $n,m$ such that $|\partial_{n,m} C |_{\pmb{\theta}=0} = \Omega(1)$. Then, there exists a constant $\gamma > 0$ such that

Figures (6)

  • Figure 1: Circuit identity used for removing CZ gates from the HEA. Using the property that the CZ gate is a Clifford gate, we can move CZ gates in each block to the beginning of the block.
  • Figure 2: Averaged squared gradients as functions of $N$ for $p \in [32, 64, 128]$. Observables (a) $O=Y_1$ and (b) $O=Y_1 \prod_{j =2}^N Z_j$ are used. Each data point presents the averaged gradient components for the RX gate acting on the first qubit, $\sum_{i=1}^p (\partial_{i,0}C)^2/p$. For each parameter initialization scheme, results are averaged over $2^{10}$ randomly sampled parameters. For the Small initialization, the gradient magnitudes do not decay with $N$ regardless of the observable. On the other hand, the MBL initialization shows $\Theta(1)$ gradient magnitudes when a local observable is used, whereas they decay exponentially for a global observable.
  • Figure 3: Normalized energies $\widetilde{E} = (\braket{H_{1,2}} - E_{\rm GS})/|E_{\rm GS}|$ as functions of optimization steps for (a) the Heisenberg model ($H_1$) and (b) the cluster model ($H_2$) with external fields. The HEA with $N=20$ and $p=256$ is used. We optimize the parameters using Adam [kingma2014adam] with learning rates (a) $\eta = 0.005$ and (b) $\eta = 0.001$, which are chosen from hyperparameter optimizations. For each initialization scheme, we run $16$ independent VQE instances. Solid curves show the averaged values for each step, while the shaded regions indicate the range between the worst and best-performing instances.
  • Figure B.1: Many-body localization of a unitary operator $\tilde{V}(\theta)$. (a) Half-chain entanglement entropy for eigenstates of $\tilde{V}(\theta)$ as a function of $\theta/\pi$. Results are averaged over all eigenstates and disorder realizations. Dashed horizontal lines indicate the Page entropy, which is expected for Haar random states. (b) Variance of the eigenstate entanglement entropy averaged over disorder realizations. For each random instance of $\tilde{V}(\theta)$, we compute $\overline{S_E^2} - \overline{S_E}^2$, and the results are averaged over all instances. (c) The averaged adjacent gap ratios. For ordered quasi-energy levels $\{E_i\}$ for each random instance of $\tilde{V}(\theta)$, gaps $\Delta_i = E_{i+1}-E_i$ are obtained. Then, the ratios $r_i=\min\{\Delta_{i+1}/\Delta_i,\Delta_{i}/\Delta_{i+1}\}$ are averaged over $i$ and all random instances. Horizontal lines indicate the expected averaged values of $r$ for the Poisson (dashed) and the Gaussian orthogonal ensemble (dotted). All presented results are obtained from $2^{12}$ random instances for $N \in [8, 10]$, $2^{10}$ for $N=12$, and $2^7$ for $N=14$.
  • Figure C.2: Scaling of gradients for the 1D (left column) and 2D HEAs (right column) with observables $O=Y_1$ (first row) and $O=Y_1\prod_{j=2}^N Z_j$ (second row). The number of blocks $p\in [32,64,128]$, $p \in [16, 32,64]$ are used for the 1D and 2D HEA, respectively. The weight of the observable is given by $S=1$ for $O=Y_1$ and $S=N$ for $O=Y_1\prod_{j=2}^N Z_j$.
  • ...and 1 more figures
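
The adjacent gap ratio defined in the Figure B.1(c) caption can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names `mean_adjacent_gap_ratio` and `quasi_energies` are hypothetical, quasi-energies are assumed to be the phases of the unitary's eigenvalues, and the wrap-around gap on the unit circle is ignored for simplicity.

```python
import numpy as np

def mean_adjacent_gap_ratio(levels):
    """Average r_i = min(D_{i+1}/D_i, D_i/D_{i+1}) over ordered levels.

    `levels` is a 1D sequence of (quasi-)energies; it is sorted here.
    """
    e = np.sort(np.asarray(levels, dtype=float))
    gaps = np.diff(e)                # D_i = E_{i+1} - E_i
    gaps = gaps[gaps > 0]            # drop exact degeneracies (avoids 0/0)
    small = np.minimum(gaps[1:], gaps[:-1])
    large = np.maximum(gaps[1:], gaps[:-1])
    # For positive gaps, min(a/b, b/a) equals min(a, b) / max(a, b).
    return float(np.mean(small / large))

def quasi_energies(unitary):
    """Phases of the eigenvalues of a unitary matrix, in (-pi, pi].

    Assumption: these phases play the role of the quasi-energy levels E_i.
    """
    return np.angle(np.linalg.eigvals(unitary))

if __name__ == "__main__":
    # Uncorrelated (Poisson-like) levels: sorted uniform random numbers.
    rng = np.random.default_rng(0)
    poisson_levels = rng.uniform(0.0, 1.0, size=10_000)
    print("Poisson-like <r>:", mean_adjacent_gap_ratio(poisson_levels))
```

For uncorrelated levels the averaged ratio is expected to approach the Poisson value $2\ln 2 - 1 \approx 0.386$, which corresponds to the dashed reference line mentioned in the caption. Averaging over disorder realizations would amount to calling `mean_adjacent_gap_ratio(quasi_energies(V))` for each sampled unitary and taking the mean.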

Theorems & Definitions (11)

  • Theorem 1
  • Lemma A.1
  • proof
  • Lemma A.2: Proposition 3 in Ref. [park2024hamiltonian]
  • Theorem A.1
  • proof
  • Lemma A.3
  • proof
  • Theorem A.2: Restatement of Theorem 1 in the main text
  • Remark
  • ...and 1 more