Spectral Bias in Variational Quantum Machine Learning
Callum Duffy, Marcin Jastrzebski
TL;DR
This work provides a rigorous link between spectral bias in parameterised quantum circuits (PQCs) and the redundancy of Fourier coefficients induced by data encodings. It presents a Fourier-based framework to bound gradients at each frequency and shows experimentally that higher redundancy accelerates learning of the corresponding frequencies, while initialization and entanglement modulate the effect. The findings suggest practical circuit-design strategies to tailor the spectrum for specific tasks and offer a robustness perspective tied to redundancy. While demonstrated on synthetic tasks with single-qubit encodings, the approach points to broader applicability in guiding PQC architectures for improved high-frequency generalization.
Abstract
In this work, we investigate spectral bias in quantum machine learning. In classical settings, spectral bias refers to the tendency of models to fit the low-frequency components of a target function earlier in training than the high-frequency ones, giving a frequency-dependent rate of convergence. We study this effect in parameterised quantum circuits (PQCs). Leveraging the established formulation of PQCs as Fourier series, we prove that spectral bias in this setting arises from the "redundancy" of the Fourier coefficients, that is, the number of terms in the analytical form of the model that contribute to the same frequency component. The choice of data encoding scheme dictates the degree of redundancy of each Fourier coefficient. We find that the magnitude of the Fourier coefficients' gradients during training strongly correlates with the coefficients' redundancy, and we demonstrate this empirically with three different encoding schemes. Additionally, we demonstrate that PQCs with greater redundancy exhibit increased robustness to random perturbations of their parameters at the corresponding frequencies. Finally, we investigate how design choices affect the ability of PQCs to learn Fourier sums, focusing on parameter initialization scale and entanglement structure, and find that large initializations and low-entanglement schemes tend to slow convergence.
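To make the notion of redundancy concrete, the following is a minimal sketch and not the authors' code. It assumes the standard Fourier-series picture of a PQC, in which each frequency is a difference between two sums of encoding-generator eigenvalues, and it takes the redundancy of a frequency to be the number of such pairs that produce it. The function name `redundancy` and the example encoding (two single-qubit rotations whose generators have eigenvalues ±1/2) are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def redundancy(eigenvalue_sets):
    """Count, for each frequency, the eigenvalue-sum pairs that produce it.

    eigenvalue_sets: one list of generator eigenvalues per encoding gate
    (hypothetical input format chosen for this sketch).
    """
    # Each choice of one eigenvalue per encoding gate gives one sum Lambda_j.
    sums = [sum(choice) for choice in product(*eigenvalue_sets)]
    # A frequency is a difference Lambda_k - Lambda_j; count the pairs (j, k).
    return Counter(k - j for j in sums for k in sums)


# Assumed example: two encoding gates, each with generator eigenvalues -1/2, +1/2.
half = Fraction(1, 2)
counts = redundancy([[-half, half]] * 2)
for freq in sorted(counts):
    print(int(freq), counts[freq])
```

Under these assumptions the zero frequency is reached by the most pairs and the highest frequencies by the fewest. This is the kind of imbalance that the abstract links to slower learning of high-frequency components.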
