
Correlation tests and sample spectral coherence matrix in the high-dimensional regime

Philippe Loubaton, Alexis Rosuel, Pascal Vallet

TL;DR

The paper addresses high-dimensional testing of independence among the components of a complex Gaussian time series by analyzing linear spectral statistics (LSS) of the frequency-smoothed estimator of the spectral coherence matrix. It establishes central limit theorems (CLTs) in the regime $M=O(N^{\alpha})$ with $\alpha<1$, where the smoothing span $B$ satisfies $c_N=M/B\to c\in(0,1)$, and proposes two statistics, evaluated over a frequency grid, that are asymptotically normal under $\mathcal{H}_0$. Using Bartlett's factorization, Stieltjes transform tools, and Gaussian concentration, the authors derive representations for $\theta_N(f,\nu)$ and show that $\frac{B\theta_N(f,\nu)}{\sigma_N(f)} \to_d \mathcal{N}(0,1)$, and that the aggregated statistics $\zeta_{N,1}(f)$ and $\zeta_{N,2}(f)$ also converge to Gaussian limits (or chi-square-type limits for the sum of squares). They further show that replacing the unknown $r_N(\nu)$ with a practical estimator $\hat{r}_N(\nu)$ preserves the CLTs, so that the asymptotic Type I error of the tests is controlled. Numerical simulations across several data-generating processes (DGPs), including non-Gaussian innovations and spatial dependence, show good finite-sample behavior and robustness relative to PGY-type methods. The results yield frequency-domain tests of independence for high-dimensional time series, together with guidance on the regime in which they are valid and on practical estimation.

Abstract

It is established that the linear spectral statistics (LSS) of the smoothed periodogram estimate of the spectral coherence matrix of a complex Gaussian high-dimensional time series $(y_n)_{n \in \mathbb{Z}}$ with independent components satisfy at each frequency a central limit theorem in the asymptotic regime where the sample size $N$, the dimension $M$ of the observation, and the smoothing span $B$ all converge towards $+\infty$ in such a way that $M = O(N^{\alpha})$ for $\alpha < 1$ and $\frac{M}{B} \rightarrow c$, $c \in (0, 1)$. It is deduced that two recentered and renormalized versions of the LSS, one based on an average in the frequency domain and the other based on a sum of squares also in the frequency domain, both evaluated over a well-chosen frequency grid, also satisfy a central limit theorem. These two statistics are proposed to test, with controlled asymptotic level, the hypothesis that the components of $y$ are independent. Numerical simulations assess the performance of the two tests.
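To make the objects in the abstract concrete, the following is a minimal Python sketch of a frequency-smoothed periodogram, the resulting sample spectral coherence matrix, and one LSS of its eigenvalues. It is an illustration under stated assumptions, not the authors' implementation: the function names `smoothed_coherence` and `lss`, the $1/\sqrt{N}$ DFT normalization, the rectangular window of $B$ Fourier frequencies (with $B$ odd) centered at $k/N$, and the simulated white-noise data are all choices made here. The test function $x \mapsto (x-1)^2$ is the one quoted in the figure captions.

```python
import numpy as np

def smoothed_coherence(y, k, B):
    """Sample spectral coherence matrix at the Fourier frequency k/N.

    y : (N, M) complex array, one column per component of the series.
    k : index of the center frequency k/N.
    B : odd smoothing span (number of Fourier frequencies averaged).
    Assumption: rectangular smoothing window with circular wrap-around.
    """
    N, M = y.shape
    xi = np.fft.fft(y, axis=0) / np.sqrt(N)       # normalized DFT, row j <-> frequency j/N
    half = B // 2
    idx = np.arange(k - half, k + half + 1) % N   # the B frequencies around k/N
    X = xi[idx]                                   # shape (B, M)
    S = X.T @ X.conj() / B                        # smoothed periodogram, Hermitian (M, M)
    d = 1.0 / np.sqrt(np.real(np.diag(S)))        # inverse square roots of the diagonal
    return d[:, None] * S * d[None, :]            # coherence matrix with unit diagonal

def lss(C, f):
    """Linear spectral statistic (1/M) * sum_m f(lambda_m) of a Hermitian matrix C."""
    eig = np.linalg.eigvalsh(C)
    return np.mean(f(eig))

# Illustrative data under the null hypothesis: independent complex Gaussian white noise.
rng = np.random.default_rng(0)
N, M, B = 512, 20, 41
y = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)

C = smoothed_coherence(y, k=100, B=B)
stat = lss(C, lambda x: (x - 1.0) ** 2)
print("c_N = M/B =", M / B, " LSS =", stat)
```

For this choice of test function the LSS equals the sum of the squared moduli of the off-diagonal coherence entries divided by $M$, so under independence it should be of the order of $c_N = M/B$; the recentering and renormalization that turn it into a test statistic are the subject of the paper and are not reproduced in this sketch.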

Paper Structure (81 sections, 31 theorems, 636 equations, 4 figures, 8 tables)


Key Result

Lemma 2.5

If the set $U^{(N)}$ is given by $U^{(N)} = \{ 1,2, \ldots, M(N) \} \times V^{(N)}$ for a certain set $V^{(N)}$ and for $M(N) \leq N$, and if $X^{(N)} = \mathcal{O}_{\prec}(a_N)$ (respectively $X^{(N)} = o_{\prec}(a_N)$), then the family $Y^{(N)}(v), v \in V^{(N)}$, defined by [...] satisfies $Y^{(N)} = \mathcal{O}_{\prec}(a_N)$ (resp. $Y^{(N)} = o_{\prec}(a_N)$).

Figures (4)

  • Figure 1: $\xi_{N,1}, \xi_{N,2}$ (against $\mathcal{N}(0,1)$ or a $\chi^2(|\mathcal{G}_N|)$ random variable) and $\xi_{N,3}$ against their respective limiting distributions: histograms (left) and qq-plots (right). Data generated by DGP 1, $N=10^4, B = 301, M=120, L=5$. $\phi_m=0.1$ and $\psi_m=0.5$ for all $m$. $f: x \mapsto (x-1)^2$. $10^4$ repetitions.
  • Figure 2: Sample mean (left) and standard deviation (right) of $\xi_{N,0}(f, \nu)$ over $10^4$ repetitions as a function of $\alpha$ and $N$. Data generated by DGP 1, $c=1/2$, $\phi_m=0.1$ and $\psi_m=0.5$ for all $m$. $f: x \mapsto (x-1)^2$. $\nu=0.3$. $r_N(\nu)$ is assumed to be known.
  • Figure 3: Sample mean (left) and standard deviation (right) of $\xi_{N,1}(f)$ over $10^4$ repetitions as a function of $\alpha$ and $N$. Data generated by DGP 1, $c=1/2$, $\phi_m=0.1$ and $\psi_m=0.5$ for all $m$. $f: x \mapsto (x-1)^2$. $r_N(\nu)$ is assumed to be known.
  • Figure 4: Empirical distributions of $\xi_{N,1}$ (against a $\mathcal{N}(0,1)$), $\xi_{N,2}$ (against a $\mathcal{N}(0,1)$ and a $\chi^2(|\mathcal{G}_N|)$) and $\xi_{N,3}$ (against a Gumbel random variable). Data generated by DGP 1 with non-Gaussian innovations (complex Student's distribution on the left, K-distribution on the right), $N=10^4, M=120, B = 301, L=5$. $\phi_m=0.1$ and $\psi_m=0.5$ for all $m$. $f: x \mapsto (x-1)^2$. $10^4$ repetitions.

Theorems & Definitions (59)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Lemma 2.5
  • proof
  • Definition 2.6
  • Proposition 2.7
  • Proposition 2.8
  • Lemma 2.9
  • Remark 2.10
  • ...and 49 more