Asymptotic analysis of the Gaussian kernel matrix for partially noisy data in high dimensions

Kensuke Aishima

TL;DR

This work analyzes the Gaussian kernel matrix in high dimensions under partial noise, building on Karoui's result that the eigenvectors are strongly consistent while the eigenvalues are inconsistent. It combines this asymptotic structure with constrained low-rank approximation to construct strongly consistent estimators: a correction based on the smallest eigenvalue for rank-deficient targets, and a block-elimination-based estimator for partially noisy data. The key contributions are: (i) a precise asymptotic relation for the eigenvalues together with proven strong consistency of the eigenvectors; (ii) a simple estimator that recovers the limiting kernel matrix in rank-deficient settings; (iii) a treatment of a partial-noise model with a structured estimator that remains consistent. The results support reconstruction and spectral analysis of kernel matrices in high-dimensional, noisy regimes, with potential impact on kernel methods in data science and scientific computing.

Abstract

The Gaussian kernel is one of the most important kernels, applicable to many research fields, including scientific computing and data science. In this paper, we present an asymptotic analysis of the Gaussian kernel matrix in high dimensions under a statistical model of noisy data. The main result combines Karoui's asymptotic analysis with procedures for constrained low-rank matrix approximation. More specifically, Karoui clarified an important asymptotic structure of the Gaussian kernel matrix, which leads to strong consistency of the eigenvectors, although the eigenvalues are inconsistent. This paper builds on these results and presents a consistent estimator that uses the smallest eigenvalue, applicable whenever the target kernel matrix tends to low rank in the asymptotic regime. Importantly, the asymptotic analysis is also given under a statistical model representing partial noise. Although a naive estimator is inconsistent in that model, applying an optimization method for low-rank approximation with constraints overcomes the difficulty caused by the inconsistency, resulting in a new estimator with strong consistency in rank-deficient cases.
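The role of the smallest eigenvalue can be illustrated numerically. The sketch below is not the paper's estimator or proof; it assumes a simplified setting chosen for illustration: every point is noisy, the noise is i.i.d. Gaussian with standard deviation `sigma`, the kernel is $\exp(-\|y_i - y_j\|^2 / d)$, and the clean data contain only a few distinct points, so the clean kernel matrix is rank deficient. Under these assumptions the noisy kernel matrix is heuristically close to $c K + (1 - c) I$ with $c = e^{-2\sigma^2}$, so its eigenvectors match those of $K$ while its eigenvalues are shifted and scaled, and its smallest eigenvalue approximates $1 - c$. All names and parameter values (`gaussian_kernel`, `d`, `n_centres`, `copies`, `sigma`) are hypothetical.

```python
import numpy as np


def gaussian_kernel(points, scale):
    """Gaussian kernel matrix with entries exp(-||p_i - p_j||^2 / scale)."""
    sq_norms = np.sum(points * points, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    np.fill_diagonal(sq_dists, 0.0)
    return np.exp(-sq_dists / scale)


rng = np.random.default_rng(0)
d, n_centres, copies, sigma = 20000, 4, 10, 0.5  # illustrative values only

# Clean data: only n_centres distinct points, so the clean kernel matrix
# has rank n_centres (assumption made to get a rank-deficient target).
centres = rng.standard_normal((n_centres, d))
clean = np.repeat(centres, copies, axis=0)
noisy = clean + sigma * rng.standard_normal(clean.shape)

k_clean = gaussian_kernel(clean, scale=d)
k_noisy = gaussian_kernel(noisy, scale=d)

# The smallest eigenvalue of the noisy kernel matrix estimates 1 - c.
lam_min = np.linalg.eigvalsh(k_noisy)[0]  # eigvalsh returns ascending order

# Undo the shift and the scaling: (K_noisy - lam_min * I) / (1 - lam_min).
k_hat = (k_noisy - lam_min * np.eye(k_noisy.shape[0])) / (1.0 - lam_min)


def rel_err(estimate):
    """Relative Frobenius-norm error against the clean kernel matrix."""
    return np.linalg.norm(estimate - k_clean) / np.linalg.norm(k_clean)


naive_error = rel_err(k_noisy)
corrected_error = rel_err(k_hat)
print(f"lam_min = {lam_min:.4f}, 1 - c = {1.0 - np.exp(-2.0 * sigma**2):.4f}")
print(f"naive error = {naive_error:.4f}, corrected error = {corrected_error:.4f}")
```

In this toy setting the uncorrected noisy matrix stays far from the clean one however large `d` is, while the shifted and rescaled matrix moves much closer. The partial-noise model treated in the paper, where a single shift of this kind no longer suffices, is not covered by this sketch.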

Paper Structure (10 sections, 9 theorems, 55 equations)

Introduction
Problem setting for consistency analysis in high dimension
Estimator with strong consistency for the Gaussian kernel matrix for high dimensional noisy data
Asymptotic analysis: strong consistency of invariant subspace
Modified estimator for reconstructions of rank deficient matrices
Proposed consistent estimator with asymptotic analysis for a model representing partial noise
Existing optimization method based on the Gaussian elimination for low rank approximations
Asymptotic analysis of matrices in the optimization method
Proposed consistent estimator for rank deficient cases
Conclusion

Key Result

Proposition 1

Assume that $\boldsymbol{\xi}_{1},\ldots ,\boldsymbol{\xi}_{n}$ are independent and identically distributed (i.i.d.) and all $n$ mean vectors are $\boldsymbol{0} \in \mathbb{R}^{d}$. Let $\sigma_{1}^{2},\ldots ,\sigma_{d}^{2}$ denote the second moments of the corresponding elem

Theorems & Definitions (18)

Proposition 1
proof
Lemma 1
proof
Theorem 1
proof
Corollary 1
Theorem 2
proof
Remark 1
...and 8 more
