Gradient Analysis of Barren Plateau in Parameterized Quantum Circuits with multi-qubit gates

Yuhan Yao; Yoshihiko Hasegawa

TL;DR

This study provides a refined framework for analyzing and optimizing Parameterized Quantum Circuits with complex multi-qubit gates. Applying the framework to single-layer and deep-layer circuits yields analytical results that quantify how the gradient variance is jointly determined by the size of the multi-qubit gate and the number of qubits, layers, and effective parameters.

Abstract

The emergence of the Barren Plateau phenomenon poses a significant challenge to quantum machine learning. While most Barren Plateau analyses focus on single-qubit rotation gates, the gradient behavior of Parameterized Quantum Circuits built from multi-qubit gates remains largely unexplored. In this work, we present a general theoretical framework for analyzing the gradient properties of Parameterized Quantum Circuits with multi-qubit gates. Our method generalizes the direct computation framework, bypassing the Haar random assumption on parameters and enabling the calculation of the gradient expectation and variance. We apply this framework to single-layer and deep-layer circuits, deriving analytical results that quantify how gradient variance is co-determined by the size of the multi-qubit gate and the number of qubits, layers, and effective parameters. Numerical simulations validate our findings. Our study provides a refined framework for analyzing and optimizing Parameterized Quantum Circuits with complex multi-qubit gates.

Paper Structure (1 section, 13 equations, 5 figures, 2 tables)

This paper contains 1 section, 13 equations, 5 figures, 2 tables.

End Matter

Figures (5)

Figure 1: Comparison of the theoretical analysis of single-layer PQCs (Eq. \ref{eq: single_layer_variance}) with numerical simulations under several settings. The plots compare the theoretical predictions (dashed lines) with numerical simulation results (scatter points) under the condition $N_\mathrm{eff} = \frac{n}{s}$. The four subplots correspond to different sizes $s$ of the $s$-qubit gate: $s=1$ with $\frac{1}{4} (\frac{5}{12})^{n-1}$ (top-left), $s=2$ with $\frac{5}{48} (\frac{1}{3})^{\frac{n}{2}-1}$ (top-right), $s=3$ with $\frac{25}{576} (\frac{17}{126})^{\frac{n}{3}-1}$ (bottom-left), and $s=4$ with $\frac{125}{6912} (\frac{37}{510})^{\frac{n}{4}-1}$ (bottom-right).
Figure 2: Behavior of the gradient variance in single-layer PQCs as a function of $N_\mathrm{eff}$. The x-axis illustrates the transition from local observables (low $N_\mathrm{eff}$) to global observables (high $N_\mathrm{eff}$). The number of qubits and the gate size are fixed at $n=18$ and $s=1$.
Figure 3: Numerical simulation of gradient variance behavior. (Left) Gradient variance $\mathrm{Var}[\partial_k \mathcal{L}]$ as a function of circuit depth $l$ (from 5 to 150) for different numbers of qubits ($n=2, 4, \dots, 12$). The variance saturates as $l$ increases for all system sizes. (Right) Log-scale plot of the gradient variance in the deep-layer regime (at $l=150$) as a function of the number of qubits $n$. The clear linear trend (dashed line) indicates an exponential decay of the variance with $n$.
Figure 4: Gradient variance $\mathrm{Var}[\partial_k \mathcal{L}]$ as a function of the factor $\frac{s N_\mathrm{eff}}{l}$. The plot includes all experimental data for a fixed qubit count $n=12$ and observable $Z^{\otimes 12}$, across various $s$-qubit gate sizes ($s=1, 2, 3, 4, 6$) and different $\frac{N_\mathrm{eff}}{l}$ ratios obtained via pruning. Points lying on the same gray line share the same value of $sN_\mathrm{eff}$.
Figure 5: Example of effective parameters. This PQC is a $6$-qubit, $3$-layer circuit in which the multi-qubit gates have size $2$. The gray gates, despite containing trainable parameters, do not affect the final measurement output. This is typically because their action does not reach the final observable, so they have no impact on the value of the loss function. This example shows that not all trainable parameters are effective for a given task.
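
The closed-form predictions quoted in the Figure 1 caption can be evaluated directly. The following is a minimal sketch, not the authors' code: it only tabulates the four (prefactor, base) pairs given in the caption for $s=1,2,3,4$ under the condition $N_\mathrm{eff} = \frac{n}{s}$, and the names `CAPTION_COEFFS` and `predicted_variance` are hypothetical. It does not implement the general expression of the paper, which is not reproduced here.

```python
from fractions import Fraction

# (prefactor, base) pairs copied from the Figure 1 caption, keyed by gate size s.
# The caption gives the prediction prefactor * base**(n/s - 1) for N_eff = n/s.
CAPTION_COEFFS = {
    1: (Fraction(1, 4), Fraction(5, 12)),
    2: (Fraction(5, 48), Fraction(1, 3)),
    3: (Fraction(25, 576), Fraction(17, 126)),
    4: (Fraction(125, 6912), Fraction(37, 510)),
}


def predicted_variance(n: int, s: int) -> Fraction:
    """Evaluate the Figure 1 caption formula for n qubits and gate size s."""
    if s not in CAPTION_COEFFS:
        raise ValueError("the caption only lists s = 1, 2, 3, 4")
    if n < s or n % s != 0:
        raise ValueError("n must be a positive multiple of s so that N_eff = n/s")
    prefactor, base = CAPTION_COEFFS[s]
    # Exact rational arithmetic avoids rounding when the values get very small.
    return prefactor * base ** (n // s - 1)


if __name__ == "__main__":
    # Tabulate the predictions for a qubit count divisible by every listed s.
    n = 12
    for s in sorted(CAPTION_COEFFS):
        print(f"n={n}, s={s}: {float(predicted_variance(n, s)):.3e}")
```

Exact fractions are used so that the tabulated values stay faithful to the caption's coefficients; converting to `float` is only needed for plotting on a log scale as in the figure.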
