Classification of high-dimensional data with spiked covariance matrix structure

Yin-Jen Chen, Minh Tang

TL;DR

This work tackles high-dimensional binary classification under a spiked covariance structure by pairing PCA-based dimension reduction with Fisher LDA in a whitened space. The procedure has three steps: whiten the data using an estimated Σ^{-1/2}, keep only the s largest coordinates of the whitened mean-difference vector, and apply LDA to the retained features. The authors prove that, when n → ∞ and n^{-1} ln p → 0 with a fixed number of spikes d and fixed sparsity s, the resulting lda ∘ pca classifier is Bayes-optimal, and they extend the result to QDA with analogous guarantees. Numerical experiments on simulated and real data show performance competitive with state-of-the-art high-dimensional classifiers while operating in a much lower-dimensional representation. The framework also accommodates extensions to multi-class settings and heterogeneous covariance structures, and is robust to elliptical distributions.
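The three steps above (whiten, screen, classify) can be sketched in NumPy as follows. This is a minimal illustration and not the authors' implementation. It assumes a spiked model of the form Σ = UΛUᵀ + σ²I, estimates σ² as the average of the non-spike sample eigenvalues, and screens by the magnitude of the estimated whitened mean difference. The function names `fit_whitened_lda` and `predict_whitened_lda` are hypothetical.

```python
import numpy as np

def fit_whitened_lda(X, y, d, s):
    """Fit a whiten -> screen -> LDA rule. X is (n, p), y takes values in {0, 1}.

    d: assumed number of spikes; s: number of whitened coordinates to keep.
    """
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    n, p = X.shape

    # Pooled within-class sample covariance.
    R = np.vstack([X0 - mu0, X1 - mu1])
    S = R.T @ R / (n - 2)

    # Top-d eigenpairs (eigh returns eigenvalues in ascending order).
    evals, evecs = np.linalg.eigh(S)
    lam, U = evals[-d:], evecs[:, -d:]
    # Assumption: noise level estimated by the mean of the remaining eigenvalues.
    sigma2 = evals[:-d].mean()

    # Estimated Sigma^{-1/2} = U diag(lam^{-1/2}) U^T + sigma^{-1} (I - U U^T).
    W = (U * (lam ** -0.5 - sigma2 ** -0.5)) @ U.T + np.eye(p) / np.sqrt(sigma2)

    # Whitened mean difference; keep the s coordinates largest in magnitude.
    zeta = W @ (mu1 - mu0)
    keep = np.argsort(np.abs(zeta))[-s:]

    return {
        "W": W,
        "mid": (mu0 + mu1) / 2,
        "zeta": zeta,
        "keep": keep,
        "log_prior": np.log(len(X1) / len(X0)),
    }

def predict_whitened_lda(model, X):
    """Return 0/1 labels from the linear discriminant on the kept coordinates."""
    Z = (X - model["mid"]) @ model["W"]  # W is symmetric, so no transpose is needed
    keep = model["keep"]
    score = Z[:, keep] @ model["zeta"][keep] + model["log_prior"]
    return (score > 0).astype(int)
```

Forming the full p × p whitening matrix keeps the sketch short but costs O(p²) memory; for very large p one would apply the low-rank part and the scaled identity separately.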

Abstract

We study the classification problem for high-dimensional data with $n$ observations on $p$ features where the $p \times p$ covariance matrix $\Sigma$ exhibits a spiked eigenvalue structure and the vector $\zeta$, given by the difference between the whitened mean vectors, is sparse. We analyze an adaptive classifier (adaptive with respect to the sparsity $s$) that first performs dimension reduction on the feature vectors prior to classification in the dimensionally reduced space, i.e., the classifier whitens the data, then screens the features by keeping only those corresponding to the $s$ largest coordinates of $\zeta$ and finally applies Fisher linear discriminant on the selected features. Leveraging recent results on entrywise matrix perturbation bounds for covariance matrices, we show that the resulting classifier is Bayes optimal whenever $n \rightarrow \infty$ and $s \sqrt{n^{-1} \ln p} \rightarrow 0$. Notably, our theory also guarantees Bayes optimality for the corresponding quadratic discriminant analysis (QDA). Experimental results on real and synthetic data further indicate that the proposed approach is competitive with state-of-the-art methods while operating on a substantially lower-dimensional representation.
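For concreteness, the population version of the screened rule can be written out as below. This is a sketch under assumed notation, which may differ from the paper's: the sign convention $\zeta = \Sigma^{-1/2}(\mu_1 - \mu_2)$, the class-1 prior $\pi_1$, and the index set $S$ of the $s$ retained coordinates are all assumptions.

$$
\text{assign } x \text{ to class 1 if } \quad \zeta_S^{\top} \Big[ \Sigma^{-1/2} \Big( x - \tfrac{\mu_1 + \mu_2}{2} \Big) \Big]_S + \ln \frac{\pi_1}{1 - \pi_1} > 0 .
$$

When $S$ contains the support of $\zeta$, the left-hand side equals $(\mu_1 - \mu_2)^{\top} \Sigma^{-1} \big( x - \tfrac{\mu_1 + \mu_2}{2} \big) + \ln \frac{\pi_1}{1 - \pi_1}$, which is the usual Fisher (Bayes) linear discriminant for two Gaussian classes with common covariance $\Sigma$.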

Paper Structure

This paper contains 27 sections, 12 theorems, 117 equations, 15 tables, 4 algorithms.

Key Result

Theorem 1

Let $\mathbf{X}$ be an $n \times p$ matrix where the rows $X_1, \dots, X_n$ are i.i.d. samples from $\pi_1 \mathcal{N}(\mu_1, \Sigma) + (1 - \pi_1) \mathcal{N}(\mu_2, \Sigma)$ and $\Sigma$ satisfies the conditions spiked_covariance, n_order_Divergent, and Bounded_Coherence. Let $\hat{\mathcal{U}}$ be the matrix of e

Theorems & Definitions (21)

  • Remark 1
  • Remark 2
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Remark 3
  • Remark 4
  • Remark 5
  • Remark 6
  • Definition 1
  • ...and 11 more