Spectrally-Corrected and Regularized QDA Classifier for Spiked Covariance Model

Wenya Luo, Hua Li, Zhidong Bai, Zhijun Liu

TL;DR

This work tackles the instability of QDA in high dimensions by introducing SR-QDA, which spectrally corrects class covariances under a spiked covariance model and applies two regularization terms tuned by maximizing the Fisher discriminant ratio $\rho(\gamma)$. It develops large-dimensional asymptotic theory showing the Fisher-ratio objective converges to a deterministic limit, provides (in closed form) asymptotically optimal regularization parameters based on spike structure, and presents a bias-corrected noise-variance estimator with a CLT. Theoretical results are complemented by practical estimation procedures and convergence guarantees, and SR-QDA is shown to outperform QDA, R-QDA, Im-QDA, SVM, and KNN in simulations and real datasets, especially in moderate-to-high dimensional settings with limited samples. Overall, SR-QDA offers a principled, scalable approach for quadratic discrimination in spiked-covariance regimes with tangible improvements in classification accuracy for high-dimensional problems.

Abstract

Quadratic discriminant analysis (QDA) is a widely used method for classification problems, particularly preferable over Linear Discriminant Analysis (LDA) for heterogeneous data. However, QDA loses its effectiveness in high-dimensional settings, where the data dimension and sample size tend to infinity. To address this issue, we propose a novel QDA method utilizing spectral correction and regularization techniques, termed SR-QDA. The regularization parameters in our method are selected by maximizing the Fisher-discriminant ratio. We compare SR-QDA with QDA, regularized quadratic discriminant analysis (R-QDA), and several other competitors. The results indicate that SR-QDA performs exceptionally well, especially in moderate and high-dimensional situations. Empirical experiments across diverse datasets further support this conclusion.
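The high-dimensional failure of plain QDA described above can be illustrated with a small simulation. The sketch below is not the authors' SR-QDA: it draws two classes from spiked covariances and applies a generic ridge-type regularization (`S + gamma * I`) as a stand-in. All names (`spiked_cov`, `fit_class`, `qda_score`), the spike values, the mean shift, and `gamma = 1.0` are assumptions chosen for illustration only.

```python
import numpy as np

def spiked_cov(p, spikes, sigma2, rng):
    """Population covariance sigma2 * (I + sum_j spikes[j] * u_j u_j^T)."""
    U, _ = np.linalg.qr(rng.standard_normal((p, len(spikes))))  # orthonormal directions
    return sigma2 * (np.eye(p) + (U * np.asarray(spikes, dtype=float)) @ U.T)

def fit_class(X, gamma):
    """Sample mean plus ridge-regularized covariance S + gamma * I (illustrative choice)."""
    mean = X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    S_reg = S + gamma * np.eye(X.shape[1])
    _, logdet = np.linalg.slogdet(S_reg)
    return mean, np.linalg.inv(S_reg), logdet

def qda_score(X, mean, prec, logdet, prior):
    """Gaussian log-posterior of each row of X, up to an additive constant."""
    D = X - mean
    maha = np.einsum("ij,jk,ik->i", D, prec, D)  # row-wise Mahalanobis distances
    return -0.5 * logdet - 0.5 * maha + np.log(prior)

rng = np.random.default_rng(0)
p, n, n_test, gamma = 50, 30, 200, 1.0  # hypothetical sizes with n < p
covs = [spiked_cov(p, [20.0, 10.0], 1.0, rng), spiked_cov(p, [15.0, 5.0], 2.0, rng)]
means = [np.zeros(p), np.full(p, 0.5)]

train = [rng.multivariate_normal(m, C, size=n) for m, C in zip(means, covs)]
test = [rng.multivariate_normal(m, C, size=n_test) for m, C in zip(means, covs)]

# With n < p each sample covariance is rank deficient, so plain QDA cannot invert it.
sample_rank = np.linalg.matrix_rank(np.cov(train[0], rowvar=False))

params = [fit_class(X, gamma) for X in train]
X_test = np.vstack(test)
y_test = np.repeat([0, 1], n_test)
scores = np.column_stack([qda_score(X_test, *prm, prior=0.5) for prm in params])
accuracy = np.mean(scores.argmax(axis=1) == y_test)

print(f"rank of sample covariance: {sample_rank} (p = {p})")
print(f"regularized QDA accuracy: {accuracy:.3f}")
```

The rank check shows why the unregularized quadratic rule is undefined when the sample size is below the dimension. The ridge term used here is only the simplest way to restore invertibility; the paper's method instead corrects the spectrum and chooses its regularization parameters by maximizing the Fisher discriminant ratio.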

Paper Structure

This paper contains 8 sections, 8 theorems, 106 equations, 2 figures, 3 tables.

Key Result

Theorem 1

Under Assumptions 1 to 5, we have where with where $\gamma^{(1)}_{j,i}=\frac{\gamma_{1,i}\lambda_{j,i}}{1+\gamma_{1,i}\lambda_{j,i}}$, $\gamma_{j,i}^{(2)}=\frac{\gamma_{2,i}\lambda_{j,i}}{1+\gamma_{2,i}\lambda_{j,i}}$, $i=0, 1$.
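Ratios of the form $\frac{\gamma\lambda}{1+\gamma\lambda}$ arise, for example, when inverting an identity matrix plus a low-rank spike term: for orthonormal $u_j$, $(I+\gamma\sum_j\lambda_j u_j u_j^\top)^{-1} = I-\sum_j\frac{\gamma\lambda_j}{1+\gamma\lambda_j}u_j u_j^\top$. Whether the theorem's quantities arise in exactly this way is an assumption here, since the displayed equations are not reproduced above. The following sketch only checks this standard identity numerically; the function names and the values of `gamma` and `lam` are hypothetical.

```python
import numpy as np

def shrinkage_weights(gamma, lam):
    """Return gamma * lam / (1 + gamma * lam) elementwise."""
    lam = np.asarray(lam, dtype=float)
    return gamma * lam / (1.0 + gamma * lam)

def spiked_inverse(gamma, lam, U):
    """Closed-form inverse of I + gamma * U diag(lam) U^T for orthonormal columns U."""
    p = U.shape[0]
    w = shrinkage_weights(gamma, lam)
    return np.eye(p) - (U * w) @ U.T

rng = np.random.default_rng(0)
p, r = 6, 2
U, _ = np.linalg.qr(rng.standard_normal((p, r)))  # p x r matrix with orthonormal columns
lam = np.array([5.0, 2.0])                        # hypothetical spike strengths
gamma = 0.7                                       # hypothetical regularization parameter

M = np.eye(p) + gamma * (U * lam) @ U.T
closed_form = spiked_inverse(gamma, lam, U)
max_err = np.max(np.abs(closed_form - np.linalg.inv(M)))
print(f"max abs deviation from numerical inverse: {max_err:.2e}")
```

Each weight lies in $(0, 1)$ for positive $\gamma$ and $\lambda$, approaches 1 for strong spikes, and approaches 0 as $\gamma\lambda \to 0$, so the regularization parameter controls how strongly each spike direction is shrunk.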

Figures (2)

  • Figure 1: Accuracy rate vs. training sample size $n$ for $p=150$ and $\pi_0=0.5$. Comparison for QDA, RLDA, ILDA and SRLDA with different values of $\sigma_1^2$.
  • Figure 2: Accuracy rate vs. training sample size for $p=40$. Comparison between the proposed SR-QDA classifier, SVM, and KNN on classes $4$ and $5$ of the Epileptic Seizure Detection Database.

Theorems & Definitions (19)

  • Remark 1
  • Remark 2
  • Theorem 1
  • Theorem 2
  • Remark 3
  • Theorem 3
  • proof
  • Theorem 4
  • Theorem 5
  • Remark 4
  • ...and 9 more