Quantum neural network with ensemble learning to mitigate barren plateaus and cost function concentration

Lucas Friedrich, Jonas Maziero

TL;DR

This work tackles two persistent training challenges of quantum neural networks, barren plateaus and cost function concentration, by introducing an ensemble-based hybrid quantum-classical neural network (HQCNN) that replaces a single depth-$L$ quantum block with an ensemble of $L$ depth-$1$ circuits. The outputs of the quantum layer are summed as $\mathbf{Y}=\sum_{l=1}^{L}\mathbf{y}_l$ and fed to the subsequent layers, which reduces circuit depth and may mitigate noise on NISQ devices while preserving expressivity. Experiments on an MNIST subset (digits 0–2) show that the ensemble approach can increase the variance of the derivative for certain parametrizations and lower the cost, often with comparable or improved accuracy and, for some configurations, faster convergence. The results indicate that ensemble-based depth reduction offers a practical route to trainable QNNs, although performance depends on the parametrization, hyperparameters, and initialization, which underscores the need for careful design in quantum-classical hybrids.
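The summation $\mathbf{Y}=\sum_{l=1}^{L}\mathbf{y}_l$ can be illustrated with a minimal sketch. The circuit below is a toy assumption and not one of the paper's parametrizations: each qubit gets an $R_Y(x_i)$ encoding followed by a trainable $R_Y(\theta_i)$, with no entangling gates, so that $\langle Z_i\rangle=\cos(x_i+\theta_i)$ can be written in closed form. The function names `depth1_circuit`, `ensemble_layer`, and `standard_layer` are hypothetical.

```python
import math

def depth1_circuit(x, theta):
    """Toy depth-1 circuit (assumption): RY(x_i) encoding, then RY(theta_i),
    then a measurement of <Z_i> on each qubit. Without entangling gates the
    rotations add, so the expectation value is cos(x_i + theta_i)."""
    return [math.cos(xi + ti) for xi, ti in zip(x, theta)]

def ensemble_layer(x, thetas):
    """Ensemble quantum layer: sum the outputs of L independent depth-1
    circuits, Y = sum_l y_l. `thetas` holds one parameter vector per circuit."""
    total = [0.0] * len(x)
    for theta_l in thetas:
        y_l = depth1_circuit(x, theta_l)
        total = [t + y for t, y in zip(total, y_l)]
    return total

def standard_layer(x, thetas):
    """Toy depth-L baseline: a single circuit that applies the L trainable
    layers in sequence after the encoding, so the angles accumulate."""
    summed = [sum(column) for column in zip(*thetas)]
    return [math.cos(xi + si) for xi, si in zip(x, summed)]

# Three features (one per qubit) and L = 2 parameter vectors.
x = [0.1, 0.4, 0.9]
thetas = [[0.0, 0.2, -0.3], [0.5, -0.1, 0.7]]
Y = ensemble_layer(x, thetas)
```

In this toy setting each entry of `Y` lies in $[-L, L]$ rather than $[-1, 1]$, so a classical layer that consumes `Y` may need to account for the different output scale.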

Abstract

The rapid development of quantum computers promises transformative impacts across diverse fields of science and technology. Quantum neural networks (QNNs), as a forefront application, hold substantial potential. Despite the multitude of proposed models in the literature, persistent challenges, notably the vanishing gradient (VG) and cost function concentration (CFC) problems, impede their widespread success. In this study, we introduce a novel approach to quantum neural network construction, specifically addressing the issues of VG and CFC. Our methodology employs ensemble learning, advocating for the simultaneous deployment of multiple quantum circuits with a depth equal to $1$, a departure from the conventional use of a single quantum circuit with depth $L$. We assess the efficacy of our proposed model through a comparative analysis with a conventionally constructed QNN. The evaluation unfolds in the context of a classification problem, yielding valuable insights into the potential advantages of our innovative approach.
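As a rough illustration of what the vanishing gradient problem means in practice, the sketch below estimates the variance of a partial derivative for a deliberately simple toy cost. This is an assumption made for illustration and not the model studied in the paper: the state is the product state $\bigotimes_i R_Y(\theta_i)|0\rangle$ and the observable is $Z^{\otimes n}$, so the cost is $\prod_i \cos\theta_i$. For angles drawn uniformly from $[0, 2\pi)$, the variance of $\partial C/\partial\theta_1$ is exactly $2^{-n}$. The function names are hypothetical.

```python
import math
import random
import statistics

def cost(thetas):
    """Toy global cost (assumption): <Z x ... x Z> on prod_i RY(theta_i)|0>,
    which equals the product of cos(theta_i)."""
    return math.prod(math.cos(t) for t in thetas)

def grad_first(thetas):
    """Derivative of the cost with respect to thetas[0], computed with the
    parameter-shift rule (shifts of +pi/2 and -pi/2)."""
    plus = [thetas[0] + math.pi / 2] + list(thetas[1:])
    minus = [thetas[0] - math.pi / 2] + list(thetas[1:])
    return 0.5 * (cost(plus) - cost(minus))

def grad_variance(n_qubits, samples=20000, seed=0):
    """Monte Carlo estimate of Var[dC/dtheta_1] over uniformly random angles."""
    rng = random.Random(seed)
    grads = [
        grad_first([rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_qubits)])
        for _ in range(samples)
    ]
    return statistics.pvariance(grads)

# The estimate shrinks roughly as 2**-n as qubits are added.
for n in (1, 2, 4, 6):
    print(n, grad_variance(n), 2.0 ** -n)
```

The exponential decay in this toy example comes from the global observable alone; the paper's variance analysis concerns its own parametrizations and depth scaling, which this sketch does not reproduce.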

Paper Structure (13 sections, 14 equations, 12 figures)

Figures (12)

  • Figure 1: In this figure, we illustrate how expressibility can be interpreted as the number of unitaries $U^{i}$ accessed by a given parameterization $U_{S}$. We represent the solutions to two problems, $A$ and $B$, by the unitaries $U^{1}$ and $U^{2}$, respectively. Our goal is to obtain a parameterization $U_{S}$ that can access these two unitaries. The accessible space is depicted in dark gray for parameterizations $U_{A}$ and $U_{B}$. As we can observe, while $U_{A}$ can only access $U^{1}$, $U_{B}$ can access both solutions. In this case, we say that the expressibility of $U_{B}$ is greater than that of $U_{A}$.
  • Figure 2: Illustration of a hybrid quantum-classical neural network (HQCNN). A) This illustration shows an HQCNN model where two classical layers are followed by a quantum layer, and finally, another two classical layers are applied. In this example, the quantum layer is obtained using a single quantum circuit with depth $L$. The gates used to encode the data obtained from the classical layer are highlighted in green. The gates depending on parameters to be optimized are shown in blue. B) Illustration of an HQCNN model using the new method. In this example, the model consists of two classical layers and one quantum layer. In the quantum layer, instead of using a single quantum circuit of depth $L$, we employ $L$ quantum circuits of depth $1$.
  • Figure 3: Illustration of the three parametrizations $U_l(\pmb{\theta}_l)$ used in this study. In this context, IsingYY refers to the two-qubit rotation gate, defined as $R_{YY}(\theta) = e^{-i \frac{\theta}{2} Y \otimes Y}$.
  • Figure 4: Variance analysis of the partial derivative with respect to the quantum layer used by the standard HQCNN model and the HQCNN model based on the new method. In the left plot, the behavior of the variances is observed when using the parametrization presented in Fig. 3 A. In the central plot, the variances corresponding to the parametrization in Fig. 3 B are shown. Finally, in the right plot, the behavior of the variances obtained when employing the parametrization in Fig. 3 C is illustrated. To obtain these results, the depth $L$ was scaled linearly with the number of qubits, that is, $L = n$.
  • Figure 5: Comparative analysis of the cost function behavior when using the standard method and the new method. To obtain these results, quantum circuits with 8 qubits and depth $L = 40$ were employed in the case of the standard method, while for the new method, 40 quantum circuits were used, each with 8 qubits and depth $L = 1$. It is observed that, in all cases, the performance of the cost function when using the new method is superior to that obtained with the standard method.
  • ...and 7 more figures
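
The two-qubit rotation defined in the caption of Figure 3, $R_{YY}(\theta) = e^{-i \frac{\theta}{2} Y \otimes Y}$, can be written out explicitly. Because $(Y \otimes Y)^2 = I$, the exponential reduces to $\cos(\theta/2)\, I - i \sin(\theta/2)\, Y \otimes Y$. The following is a minimal sketch that builds this $4 \times 4$ matrix with plain Python lists; the helper names `kron`, `matmul`, and `ryy` are hypothetical and do not come from the paper.

```python
import math

# Pauli-Y matrix.
Y = [[0, -1j], [1j, 0]]

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [
        [a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
        for i in range(n * m)
    ]

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def ryy(theta):
    """R_YY(theta) = cos(theta/2) I - i sin(theta/2) (Y x Y)."""
    yy = kron(Y, Y)
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [
        [c * (1 if i == j else 0) - 1j * s * yy[i][j] for j in range(4)]
        for i in range(4)
    ]

U = ryy(0.7)
```

Two quick sanity checks on such a matrix are that `ryy(0.0)` is the identity and that `U` multiplied by its conjugate transpose gives the identity, as expected for a unitary gate.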