The Inductive Bias of Quantum Kernels
Jonas M. Kübler, Simon Buchholz, Bernhard Schölkopf
TL;DR
The paper analyzes when quantum kernel methods can outperform classical approaches by examining the inductive bias encoded in the kernel's spectrum. Modeling data embeddings as quantum density matrices, it shows that generalization is feasible only when the RKHS remains effectively low-dimensional or when a problem-specific bias is imposed through biased (projected) kernels; otherwise the kernel's expressivity harms generalization, and estimating the kernel can require exponentially many measurements. The authors prove bounds on the largest kernel eigenvalue, propose biased kernel constructions based on reduced density matrices, and demonstrate experimentally that the right bias enables learning from limited data while the wrong bias fails. They conclude that quantum advantages are plausible primarily when the data-generating process is naturally quantum or when the encoded bias is hard to replicate classically, which implies limited prospects for quantum speed-ups on typical classical datasets.
Abstract
It has been hypothesized that quantum computers may lend themselves well to applications in machine learning. In the present work, we analyze function classes defined via quantum kernels. Quantum computers offer the possibility to efficiently compute inner products of exponentially large density operators that are classically hard to compute. However, having an exponentially large feature space renders the problem of generalization hard. Furthermore, being able to evaluate inner products in high-dimensional spaces efficiently by itself does not guarantee a quantum advantage, as already classically tractable kernels can correspond to high- or infinite-dimensional reproducing kernel Hilbert spaces (RKHS). We analyze the spectral properties of quantum kernels and find that we can expect an advantage if their RKHS is low-dimensional and contains functions that are hard to compute classically. If the target function is known to lie in this class, this implies a quantum advantage, as the quantum computer can encode this inductive bias, whereas there is no classically efficient way to constrain the function class in the same way. However, we show that finding suitable quantum kernels is not easy because the kernel evaluation might require exponentially many measurements. In conclusion, our message is a somewhat sobering one: we conjecture that quantum machine learning models can offer speed-ups only if we manage to encode knowledge about the problem at hand into quantum circuits, while encoding the same bias into a classical model would be hard. These situations may plausibly occur when learning on data generated by a quantum process; however, they appear to be harder to come by for classical datasets.
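
To make the phrase "inner products of density operators" concrete, the following is a minimal sketch of a quantum kernel k(x, x') = Tr[rho(x) rho(x')], simulated classically with NumPy. The embedding is an assumption chosen for illustration (one qubit per feature, each rotated by the feature value), not necessarily the construction used in the paper, and the names embed, quantum_kernel and gram_matrix are hypothetical. Because this embedding yields a product state, the kernel is classically tractable; the sketch only illustrates the definition and how the spectrum of the Gram matrix can be inspected.

```python
import numpy as np

def embed(x):
    """Map a real vector x to a pure-state density matrix on len(x) qubits.

    Assumed embedding (illustrative only): qubit i is prepared in
    cos(x_i / 2)|0> + sin(x_i / 2)|1>, and the full state is the tensor product.
    """
    state = np.array([1.0])
    for angle in x:
        qubit = np.array([np.cos(angle / 2), np.sin(angle / 2)])
        state = np.kron(state, qubit)
    return np.outer(state, state.conj())

def quantum_kernel(x1, x2):
    """Kernel as the Hilbert-Schmidt inner product Tr[rho(x1) rho(x2)]."""
    return float(np.real(np.trace(embed(x1) @ embed(x2))))

def gram_matrix(X):
    """Gram matrix K[i, j] = k(X[i], X[j]) for the rows of X."""
    n = len(X)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = quantum_kernel(X[i], X[j])
    return K

# 20 random inputs with 4 features each, i.e. 4 qubits and 16 x 16 density matrices.
rng = np.random.default_rng(0)
X = rng.uniform(0, 2 * np.pi, size=(20, 4))
K = gram_matrix(X)

# Eigenvalues of K / n in decreasing order, an empirical stand-in for the kernel spectrum.
eigvals = np.linalg.eigvalsh(K)[::-1] / len(X)
```

For this particular embedding the kernel has the closed form of a product over features of cos^2((x_i - x'_i) / 2), which gives a quick sanity check. The density matrices grow as 2^d x 2^d with the number of features d, which is why a classical simulation of this kind is limited to small d.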
