Local and Multi-Scale Strategies to Mitigate Exponential Concentration in Quantum Kernels
Claudia Zendejas-Morales, Debashis Saikia, Utkarsh Singh
TL;DR
This work addresses exponential concentration in fidelity-based quantum kernels, where the Gram matrix collapses toward the identity as system size or circuit expressivity grows. It introduces two practical mitigation strategies, local patch-wise kernels and multi-scale kernel mixtures, implemented in Qiskit, and evaluates them on real and synthetic tabular datasets while sweeping the feature dimension $d$ from 4 to 20. Three diagnostics are examined: the median and 95th percentile ($p50$/$p95$) of the off-diagonal kernel entries, the effective rank $r_{eff}$, and centered alignment with labels. The study shows that locality and scale mixing consistently reshape kernel geometry and yield a richer spectral structure, though improvements in SVM accuracy are dataset-dependent. For the design of quantum kernel pipelines, the findings suggest that locality and multi-scale mixing can reduce concentration and expand the informative kernel directions, with practical implications for scalable quantum-classical learning and hardware-aware deployments.
Abstract
Fidelity-based quantum kernels provide a direct interface between quantum feature maps and classical kernel methods, but they can exhibit exponential concentration: with increasing system size or circuit expressivity, the Gram matrix approaches the identity and suppresses informative similarity structure. We present an empirical study of two mitigation strategies implemented in Qiskit: (i) local (patch-wise) kernels that aggregate subsystem similarities, and (ii) multi-scale kernels that mix local and global similarity across patch granularities. We benchmark baseline, local, and multi-scale kernels under matched preprocessing, splits, and SVM protocols on several tabular datasets, sweeping the feature dimension $d\in\{4,6,\dots,20\}$. We report concentration diagnostics based on off-diagonal kernel statistics, spectral richness via effective rank, and centered alignment with labels. Across datasets, local and multi-scale constructions consistently mitigate concentration and yield richer kernel spectra relative to the global fidelity baseline, while the impact on classification accuracy depends on the dataset and dimension.
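The three kernel constructions and the three diagnostics named in the abstract can be illustrated with a small NumPy sketch. It is not the Qiskit implementation used in the study, and it rests on several assumptions: a toy product-state feature map in which each feature $x_i$ is encoded by a single-qubit RY rotation (so fidelities have a closed form), a local kernel defined as the plain average of patch fidelities over contiguous patches, a multi-scale kernel defined as a uniform convex mixture of local kernels and the global kernel, and the participation-ratio definition of effective rank. All function names are hypothetical, and the definitions used in the study may differ.

```python
import numpy as np


def patch_fidelity(X, cols):
    """Fidelity kernel restricted to the features in `cols`.

    Assumed toy feature map: feature x_i is encoded as RY(x_i)|0> on its own
    qubit, so the patch fidelity factorises as prod_i cos^2((x_i - y_i) / 2).
    """
    Xp = X[:, cols]
    diff = Xp[:, None, :] - Xp[None, :, :]          # shape (n, n, len(cols))
    return np.prod(np.cos(diff / 2.0) ** 2, axis=-1)


def global_kernel(X):
    """Baseline: fidelity over all features at once."""
    return patch_fidelity(X, np.arange(X.shape[1]))


def local_kernel(X, patch_size):
    """Average of patch fidelities over contiguous patches (assumed aggregation)."""
    d = X.shape[1]
    patches = [np.arange(s, min(s + patch_size, d)) for s in range(0, d, patch_size)]
    return np.mean([patch_fidelity(X, p) for p in patches], axis=0)


def multiscale_kernel(X, patch_sizes=(1, 2, 4), weights=None):
    """Convex mixture of local kernels at several patch sizes plus the global kernel."""
    parts = [local_kernel(X, p) for p in patch_sizes] + [global_kernel(X)]
    if weights is None:
        weights = np.full(len(parts), 1.0 / len(parts))  # uniform weights (assumption)
    return sum(w * K for w, K in zip(weights, parts))


def offdiag_percentiles(K, qs=(50, 95)):
    """Percentiles of the off-diagonal entries (p50 and p95 by default)."""
    off = K[~np.eye(len(K), dtype=bool)]
    return np.percentile(off, qs)


def effective_rank(K):
    """Participation ratio (sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    eig = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    return eig.sum() ** 2 / (eig ** 2).sum()


def centered_alignment(K, y):
    """Centered alignment between K and the label kernel y y^T, with y in {-1, +1}."""
    n = len(y)
    H = np.eye(n) - np.ones((n, n)) / n             # centering matrix
    Kc = H @ K @ H
    Yc = H @ np.outer(y, y) @ H
    return np.sum(Kc * Yc) / (np.linalg.norm(Kc) * np.linalg.norm(Yc))


# Synthetic demo data (not from the study): 40 samples, d = 12 features.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, np.pi, size=(40, 12))
y = np.where(X[:, 0] > np.pi / 2, 1.0, -1.0)

kernels = {
    "global": global_kernel(X),
    "local": local_kernel(X, patch_size=2),
    "multi-scale": multiscale_kernel(X),
}
for name, K in kernels.items():
    p50, p95 = offdiag_percentiles(K)
    print(f"{name:12s} p50={p50:.3f} p95={p95:.3f} "
          f"r_eff={effective_rank(K):.1f} align={centered_alignment(K, y):.3f}")
```

In this toy setting the global fidelity is a product of $d$ per-feature factors in $[0, 1]$, so every off-diagonal entry can only shrink as $d$ grows, whereas each patch fidelity involves only `patch_size` factors and the patch average does not shrink with the number of patches. Any of the returned matrices could be passed to a precomputed-kernel SVM for the accuracy comparison.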
