Laziness, Barren Plateau, and Noise in Machine Learning
Junyu Liu, Zexi Lin, Liang Jiang
TL;DR
The paper defines laziness as a large suppression of variational parameter updates in quantum circuits and differentiates it from barren plateaus, arguing that laziness need not impede learning in the overparametrized regime. It formulates a quantum neural tangent kernel (QNTK) framework and shows that, under random 2-designs, the average kernel scales as $\bar{K} \approx \frac{2L \mathrm{Tr}(O^2)}{N^2}$ and concentrates for large $L$, enabling exponential decay of the residual error via $\varepsilon(t) = (1 - \eta K)^t \varepsilon(0)$; this provides a precision-based view that laziness does not equate to algorithmic failure and helps explain training dynamics in variational quantum algorithms. The work further analyzes the impact of measurement and control noise, deriving a stochastic recurrence that yields a late-time plateau $\mathcal{L}(\infty) \approx \frac{\sigma_\theta^2}{2\eta(2 - \eta K)}$, and suggests operating in the overparametrized regime with $\eta K = O(1)$ to maintain performance despite noise. Finally, it connects quantum and classical learning via the neural tangent kernel perspective, discusses design trade-offs between expressibility and barren-plateau avoidance, and outlines directions for near-term quantum devices and broader theoretical links to classical machine learning.
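The two closed-form expressions quoted above can be checked numerically with a minimal sketch. It assumes, as an illustration rather than as the paper's stated model, that the stochastic recurrence has the linear form $\varepsilon(t+1) = (1 - \eta K)\varepsilon(t) + \xi_t$ with a constant kernel $K$ and Gaussian kicks $\xi_t$ of variance $\sigma_\theta^2 K$, and that the loss is $\mathcal{L} = \varepsilon^2/2$. Under those assumptions the stationary loss equals the plateau value $\frac{\sigma_\theta^2}{2\eta(2 - \eta K)}$. The function names and the parameter values are hypothetical.

```python
import numpy as np


def noiseless_residual(eps0, eta, K, steps):
    """Closed-form residual eps(t) = (1 - eta*K)**t * eps0 for t = 0..steps."""
    t = np.arange(steps + 1)
    return (1.0 - eta * K) ** t * eps0


def predicted_plateau(eta, K, sigma_theta):
    """Late-time loss quoted in the text: sigma^2 / (2 * eta * (2 - eta*K))."""
    return sigma_theta ** 2 / (2.0 * eta * (2.0 - eta * K))


def simulated_plateau(eps0, eta, K, sigma_theta, steps, runs, seed=0):
    """Monte Carlo estimate of the late-time loss E[eps^2] / 2.

    ASSUMPTION (not taken from the source): each step adds an independent
    Gaussian kick with variance sigma_theta**2 * K to the residual, and the
    kernel K stays constant during training.
    """
    rng = np.random.default_rng(seed)
    eps = np.full(runs, eps0, dtype=float)
    for _ in range(steps):
        kick = rng.normal(0.0, sigma_theta * np.sqrt(K), size=runs)
        eps = (1.0 - eta * K) * eps + kick
    return 0.5 * np.mean(eps ** 2)


if __name__ == "__main__":
    eta, K, sigma_theta = 0.5, 1.0, 0.01  # hypothetical values with eta*K = O(1)
    print("noiseless residual:", noiseless_residual(1.0, eta, K, 3))
    print("predicted plateau: ", predicted_plateau(eta, K, sigma_theta))
    print("simulated plateau: ",
          simulated_plateau(1.0, eta, K, sigma_theta, steps=200, runs=20000))
```

With $\eta K = O(1)$ the initial residual is forgotten after a few tens of steps, so the simulated value is set by the noise term alone and should agree with the predicted plateau up to sampling error.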
Abstract
We define \emph{laziness} as a large suppression of variational parameter updates in neural networks, classical or quantum. In the quantum case, the suppression is exponential in the number of qubits for randomized variational quantum circuits. We discuss how laziness differs from the \emph{barren plateau}, a notion introduced in quantum machine learning in \cite{mcclean2018barren} to describe the flatness of the loss function landscape during gradient descent. We provide a novel theoretical understanding of these two phenomena in light of the theory of neural tangent kernels. For noiseless quantum circuits without measurement noise, the loss function landscape is complicated in the overparametrized regime, where the number of trainable variational angles is large. Rather than being flat, the landscape around a random starting point in optimization contains a large number of local minima that are good enough to minimize the mean square loss function; there we still have quantum laziness, but we do not have barren plateaus. However, this complicated landscape is not visible within a limited number of iterations or with low precision in quantum control and quantum sensing. Moreover, we study the effect of noise during optimization by assuming intuitive noise models, and show that variational quantum algorithms are noise-resilient in the overparametrized regime. Our work reformulates the quantum barren plateau statement as a precision statement and justifies it in certain noise models, injects new hope into near-term variational quantum algorithms, and provides theoretical connections to classical machine learning. This paper provides conceptual perspectives on quantum barren plateaus, together with the discussion of gradient descent dynamics in \cite{together}.
