Local and Multi-Scale Strategies to Mitigate Exponential Concentration in Quantum Kernels

Claudia Zendejas-Morales; Debashis Saikia; Utkarsh Singh

TL;DR

This work addresses exponential concentration in fidelity-based quantum kernels, where the Gram matrix collapses toward the identity as system size or circuit expressivity grows. It introduces two practical mitigation strategies, local patch-wise kernels and multi-scale kernel mixtures, implemented in Qiskit, and evaluates them on real and synthetic tabular datasets while sweeping the feature dimension $d$ up to 20. Using diagnostics such as the p50 and p95 off-diagonal kernel values, the effective rank $r_{\mathrm{eff}}$, and centered alignment with labels, the study shows that locality and scale mixing consistently reshape kernel geometry and yield a richer spectral structure, though improvements in SVM accuracy are dataset-dependent. For the design of quantum kernel pipelines, the findings suggest that locality and multi-scale mixing can reduce concentration and expand the set of informative kernel directions, with practical implications for scalable quantum-classical learning and hardware-aware deployments.

Abstract

Fidelity-based quantum kernels provide a direct interface between quantum feature maps and classical kernel methods, but they can exhibit exponential concentration: with increasing system size or circuit expressivity, the Gram matrix approaches the identity and suppresses informative similarity structure. We present an empirical study of two mitigation strategies implemented in Qiskit: (i) local (patch-wise) kernels that aggregate subsystem similarities, and (ii) multi-scale kernels that mix local and global similarity across patch granularities. We benchmark baseline, local, and multi-scale kernels under matched preprocessing, splits, and SVM protocols on several tabular datasets, sweeping the feature dimension $d\in\{4,6,\dots,20\}$. We report concentration diagnostics based on off-diagonal kernel statistics, spectral richness via effective rank, and centered alignment with labels. Across datasets, local and multi-scale constructions consistently mitigate concentration and yield richer kernel spectra relative to the global fidelity baseline, while the impact on classification accuracy depends on the dataset and dimension.
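The three kernel families named in the abstract can be illustrated with a small classical sketch. It is not the paper's Qiskit implementation: it assumes a toy product-state encoding (one qubit per feature, prepared as $R_Y(x_j)|0\rangle$), for which the fidelity factorizes in closed form as $\prod_j \cos^2((x_j - y_j)/2)$. The function names, the contiguous non-overlapping patches, the plain averaging over patches, and the uniform mixing weights are all assumptions made for illustration.

```python
import numpy as np

def global_fidelity_kernel(X):
    """Fidelity kernel for a toy product-state encoding (assumption, not the paper's map).

    Each feature x_j is encoded on its own qubit as RY(x_j)|0>, so the state
    overlap factorizes and k(x, y) = prod_j cos^2((x_j - y_j) / 2).
    """
    diff = X[:, None, :] - X[None, :, :]            # pairwise differences, shape (n, n, d)
    return np.prod(np.cos(diff / 2.0) ** 2, axis=-1)

def local_patch_kernel(X, patch_size):
    """Average the fidelity kernels of contiguous, non-overlapping feature patches."""
    d = X.shape[1]
    patches = [slice(s, min(s + patch_size, d)) for s in range(0, d, patch_size)]
    return np.mean([global_fidelity_kernel(X[:, p]) for p in patches], axis=0)

def multi_scale_kernel(X, patch_sizes, weights=None):
    """Convex mixture of local kernels at several patch sizes.

    A patch size equal to d reproduces the global kernel, so the mixture
    can combine local and global similarity.
    """
    kernels = [local_patch_kernel(X, m) for m in patch_sizes]
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))  # uniform mix (assumption)
    weights = np.asarray(weights, dtype=float) / np.sum(weights)
    return sum(w * K for w, K in zip(weights, kernels))

# Small demonstration on random angles (synthetic data, for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, np.pi, size=(40, 12))
K_global = global_fidelity_kernel(X)
K_local = local_patch_kernel(X, patch_size=2)
K_multi = multi_scale_kernel(X, patch_sizes=[2, 4, 12])

off = ~np.eye(X.shape[0], dtype=bool)               # mask selecting off-diagonal entries
for name, K in [("global", K_global), ("local", K_local), ("multi-scale", K_multi)]:
    print(f"{name:12s} median off-diagonal = {np.median(K[off]):.4f}")
```

In this toy setting every factor lies in $[0, 1]$, so the patch average is entrywise at least as large as the full product. This is the simplest version of the effect reported above: the global kernel's off-diagonal entries shrink as $d$ grows, while the local and mixed kernels retain larger similarities. All three kernels keep a unit diagonal without extra normalization because each patch kernel does.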

Paper Structure (56 sections, 20 equations, 10 figures, 2 algorithms)

Introduction
Methods
Kernel definitions
Baseline (global fidelity) kernel
Local (patch-wise) kernels
Subcircuit-based patch kernel.
Reduced Density Matrix (RDM) local kernel.
Patch aggregation.
Multi-scale kernels
Normalization and PSD correction
Local kernel.
Baseline and multi-scale kernels.
Feature maps and implementation
ZZ-style feature maps.
Depth and entanglement patterns.
...and 41 more sections

Figures (10)

Figure 1: Off-diagonal concentration (p50) vs. feature dimension $d$ for all datasets. The baseline fidelity kernel concentrates rapidly as $d$ increases (p50 $\to 0$), while the local kernel maintains substantially larger off-diagonal similarities; multi-scale is typically intermediate. All kernels are unit-diagonal normalized.
Figure 2: Entropy-based effective rank $r_{\mathrm{eff}}(K)$ vs. feature dimension $d$. Local kernels generally preserve a richer spectrum (higher effective rank) than the baseline fidelity kernel; multi-scale typically lies between baseline and local.
Figure 3: SVM test accuracy vs. feature dimension $d$. For each dataset and $d$, the SVM regularization parameter $C$ is selected by validation accuracy from the fixed grid in Eq. \ref{eq:svm_C_grid}, then evaluated once on the test split. Accuracy gains from local and multi-scale kernels are dataset dependent.
Figure 4: Heatmaps of test-accuracy deltas relative to baseline across datasets (rows) and feature dimensions $d$ (columns).
Figure 5: Mean test-accuracy delta relative to the baseline kernel, reported per dataset and averaged across feature dimensions $d\in\{4,6,\dots,20\}$. Positive values indicate that the local or multi-scale kernel improves accuracy on average, while negative values indicate a decrease relative to baseline.
...and 5 more figures
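
The diagnostics named in the captions above can be sketched in NumPy as follows. The exact definitions used in the paper are not reproduced on this page, so the forms below are assumptions based on common conventions: off-diagonal percentiles taken over all $i \neq j$ entries, effective rank as the exponential of the Shannon entropy of the normalized eigenvalue spectrum, and centered alignment as the normalized Frobenius inner product between the centered kernel and the centered label outer product $yy^\top$. The function names are hypothetical.

```python
import numpy as np

def offdiag_percentiles(K, qs=(50, 95)):
    """Percentiles of the off-diagonal kernel entries (e.g. p50 and p95)."""
    mask = ~np.eye(K.shape[0], dtype=bool)
    return {f"p{q}": float(np.percentile(K[mask], q)) for q in qs}

def effective_rank(K, eps=1e-12):
    """Entropy-based effective rank: exp(H(p)) with p the normalized eigenvalues.

    Assumed definition; equals n for the identity and 1 for a rank-one kernel.
    """
    evals = np.linalg.eigvalsh((K + K.T) / 2.0)     # symmetrize before the eigensolve
    evals = np.clip(evals, 0.0, None)               # drop tiny negative round-off
    p = evals / evals.sum()
    p = p[p > eps]                                  # treat 0 * log(0) as 0
    return float(np.exp(-np.sum(p * np.log(p))))

def centered_alignment(K, y):
    """Centered kernel-target alignment between K and the label kernel y y^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n             # centering matrix
    Kc = H @ K @ H
    Yc = H @ np.outer(y, y) @ H
    return float(np.sum(Kc * Yc) / (np.linalg.norm(Kc) * np.linalg.norm(Yc)))

# Sanity checks on two extreme cases (labels in {-1, +1} are an assumption).
n = 8
y = np.array([1, 1, 1, -1, -1, -1, 1, -1], dtype=float)
K_identity = np.eye(n)                              # fully concentrated kernel
K_labels = np.outer(y, y)                           # kernel that encodes the labels exactly

print(offdiag_percentiles(K_identity))
print(effective_rank(K_identity), effective_rank(K_labels))
print(centered_alignment(K_labels, y))
```

The identity matrix is the limit that concentration drives toward: its off-diagonal percentiles are zero, and its effective rank is maximal even though it carries no similarity structure. For that reason the effective rank is best read together with the off-diagonal statistics and the alignment rather than on its own.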
