Avoiding barren plateaus via Gaussian Mixture Model
Xiao Shi, Yun Shang
TL;DR
This work tackles the barren plateau (BP) problem in variational quantum algorithms (VQAs) by introducing a Gaussian Mixture Model (GMM) initialization for the parameter vectors of hardware-efficient parameterized quantum circuits (PQCs). The authors prove, for single-term, multi-term, and general cost functions, that a GMM-based initialization yields a lower bound on the gradient norm that is independent of the number of qubits $N$ and increases with circuit depth $L$, with concrete bounds such as $\mathbb{E}\|\nabla f\|^2 \ge \frac{1}{4}-\frac{1}{8L}$ and extensions that include cross-terms for multi-term observables. They provide extensive numerical evidence on local cost functions (e.g., the 1D transverse-field Ising model) and global cost functions, as well as quantum-chemistry simulations (LiH with the Jordan-Wigner mapping), demonstrating robust training performance, larger gradient magnitudes, and faster convergence under noise. The results suggest that GMM initialization can enable training of larger and deeper PQCs on NISQ devices, with practical guidance for choosing distributions and variances. Overall, the paper offers both rigorous theoretical guarantees and practical validation that GMM initialization mitigates barren plateaus across a broad class of VQAs.
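To make the depth dependence of the quoted bound concrete, the short Python sketch below evaluates $\frac{1}{4}-\frac{1}{8L}$ for a few depths. It only reproduces the arithmetic of the expression as stated in this summary; the function name `gradient_norm_lower_bound` is hypothetical, and the summary does not say which cost-function class this particular bound belongs to.

```python
from fractions import Fraction


def gradient_norm_lower_bound(depth: int) -> Fraction:
    """Evaluate 1/4 - 1/(8L), the bound quoted in the TL;DR, for depth L >= 1.

    The number of qubits N does not appear: the quoted bound is independent of it.
    """
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    return Fraction(1, 4) - Fraction(1, 8 * depth)


# Tabulate the bound for a few illustrative depths (chosen arbitrarily here).
bounds = {L: gradient_norm_lower_bound(L) for L in (1, 2, 10, 100)}
for L, b in bounds.items():
    print(f"L={L:>3}: bound = {b} (about {float(b):.5f})")
```

The values rise from 1/8 at $L=1$ toward the limit 1/4 as $L$ grows, so the bound never decays with depth.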
Abstract
Variational quantum algorithms are among the most representative algorithms in quantum computing, with a wide range of applications in quantum machine learning, quantum simulation, and related fields. However, they suffer from the barren plateau phenomenon, especially for large numbers of qubits, deep circuits, or global cost functions, which often makes them untrainable. In this paper, we propose a novel parameter initialization strategy based on Gaussian Mixture Models. We rigorously prove that the proposed initialization method consistently avoids the barren plateau problem for hardware-efficient ansatzes of arbitrary depth and qubit number, and for any given cost function. Specifically, we find that the lower bound on the gradient norm provided by the proposed method is independent of the number of qubits $N$ and increases with the circuit depth $L$. Our results rigorously demonstrate the significance of Gaussian Mixture Model initialization strategies in determining the trainability of quantum circuits, which provides valuable guidance for future theoretical investigations and practical applications.
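As a rough illustration of what drawing initial circuit parameters from a Gaussian mixture can look like, the NumPy sketch below samples each angle by first picking a mixture component and then drawing from that component's normal distribution. Everything specific here is an assumption made for illustration: the function name `sample_gmm_parameters`, the component means, the shared standard deviation `sigma`, the equal weights, and the one-angle-per-layer-per-qubit array shape are placeholders, not the mixture the paper prescribes.

```python
import numpy as np


def sample_gmm_parameters(shape, means, sigma, weights=None, seed=None):
    """Draw an array of rotation angles from a 1D Gaussian mixture.

    shape   -- shape of the parameter array, e.g. (layers, qubits) (assumed layout)
    means   -- component means (placeholder values, not taken from the paper)
    sigma   -- standard deviation shared by all components (assumption)
    weights -- mixing weights; defaults to equal weights
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    if weights is None:
        weights = np.full(means.size, 1.0 / means.size)
    weights = np.asarray(weights, dtype=float)
    if weights.shape != means.shape or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must match means and sum to 1")
    # Step 1: choose a mixture component independently for every parameter.
    component = rng.choice(means.size, size=shape, p=weights)
    # Step 2: sample each parameter from its chosen Gaussian component.
    return rng.normal(loc=means[component], scale=sigma)


# Hypothetical usage: 4 layers, 6 qubits, two components at -pi/2 and +pi/2.
example_means = [-np.pi / 2, np.pi / 2]
theta = sample_gmm_parameters((4, 6), means=example_means, sigma=0.1, seed=0)
print(theta.shape)
```

The resulting array would then be passed to whatever circuit simulator or hardware backend is in use; choosing the actual means and variances should follow the paper's guidance rather than the placeholder values above.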
