Spectral analysis of large dimensional Chatterjee's rank correlation matrix

Zhaorui Dong, Fang Han, Jianfeng Yao

Abstract

This paper studies the spectral behavior of the large dimensional Chatterjee's rank correlation matrix when observations are independent draws from a high-dimensional random vector with independent continuous components. We show that the empirical spectral distribution of its symmetrized version converges to the semicircle law, thereby providing the first example of a large correlation matrix that deviates from the Marchenko-Pastur law governing the Pearson, Kendall, and Spearman correlation matrices. We further establish central limit theorems for linear spectral statistics, which in turn enable the development of Chatterjee's rank correlation-based tests of complete independence among the components.
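As a rough illustration of the objects named in the abstract, the sketch below computes Chatterjee's rank correlation for every ordered pair of columns of an independent Gaussian sample and then takes the eigenvalues of a symmetrized matrix. The pairwise coefficient uses the standard no-ties formula $\xi_n = 1 - 3\sum_{i=1}^{n-1}|r_{i+1}-r_i|/(n^2-1)$. Everything else is an assumption made for illustration only: the function names `chatterjee_xi` and `chatterjee_matrix`, the unit diagonal, and the plain average $(\Xi + \Xi^\top)/2$ used as the symmetrization. The paper's exact definition and normalization of $\bm{\Phi}_n$ are not reproduced in this summary and may differ.

```python
import numpy as np


def chatterjee_xi(x, y):
    """Chatterjee's rank correlation xi_n(x, y), assuming no ties in x or y."""
    n = len(x)
    order = np.argsort(x)                          # reorder pairs by increasing x
    ranks = np.argsort(np.argsort(y[order])) + 1   # ranks of y in that order
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)


def chatterjee_matrix(data):
    """p x p matrix whose (j, k) entry is xi_n(column j, column k).

    The diagonal is set to 1 here; this is an illustrative convention,
    not necessarily the one used in the paper.
    """
    n, p = data.shape
    xi = np.eye(p)
    for j in range(p):
        for k in range(p):
            if j != k:
                xi[j, k] = chatterjee_xi(data[:, j], data[:, k])
    return xi


# Independent continuous components, with the sizes quoted for Figure 1.
rng = np.random.default_rng(0)
n, p = 200, 100
data = rng.standard_normal((n, p))

xi = chatterjee_matrix(data)        # not symmetric in general
sym = (xi + xi.T) / 2               # assumed symmetrization (unnormalized)
eigvals = np.linalg.eigvalsh(sym)   # real eigenvalues of the symmetric matrix
```

A histogram of `eigvals` gives an unnormalized view of the spectrum. Comparing it with the semicircle density would require the centering and scaling from the paper's definition, which this sketch does not attempt.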

Paper Structure

This paper contains 31 sections, 27 theorems, 262 equations, 3 figures, 3 tables.

Key Result

Theorem 1.1

Under Assumption assump:dgp and the asymptotic regime eq:asymptotics, the empirical spectral distribution of $\bm{\Phi}_n$ converges to the semicircle law, where $\bm{\Phi}_n$ was introduced in eq:phin.

Figures (3)

  • Figure 1: Semicircle law of $\bm{\Phi}_n$ with $p=100$ and $n=200$.
  • Figure 2: M-P law of $\bm{\Psi}_n$ with $p=100$ and $n=200$.
  • Figure 3: Histograms of $\operatorname{tr}(\bm{\Psi}_n^k)$ centered by sample means, with $p=500$, $n=200$, and over 500 replications.

Theorems & Definitions (37)

  • Theorem 1.1: Semicircle law for $\bm{\Phi}_n$
  • Theorem 1.2: M-P law for $\bm{\Psi}_n$
  • Theorem 1.3: CLT for LSS of $\bm{\Psi}_n$
  • Proposition 2.1: MR4185806
  • Remark 2.1
  • Proposition 2.2: Dependence structure of relative ranks
  • Corollary 2.1
  • Proposition 2.3
  • Proposition 2.4
  • Proposition 4.1
  • ...and 27 more