On high-dimensional wavelet eigenanalysis

Patrice Abry, B. Cooper Boniece, Gustavo Didier, Herwig Wendt

Abstract

In this paper, we characterize the asymptotic and large scale behavior of the eigenvalues of wavelet random matrices in high dimensions. We assume that possibly non-Gaussian, finite-variance $p$-variate measurements are made of a low-dimensional $r$-variate ($r \ll p$) fractional stochastic process with non-canonical scaling coordinates and in the presence of additive high-dimensional noise. The measurements are correlated both time-wise and between rows. We show that the $r$ largest eigenvalues of the wavelet random matrices, when appropriately rescaled, converge in probability to scale-invariant functions in the high-dimensional limit. By contrast, the remaining $p-r$ eigenvalues remain bounded in probability. Under additional assumptions, we show that the $r$ largest log-eigenvalues of wavelet random matrices exhibit asymptotically Gaussian distributions. The results have direct consequences for statistical inference.

Paper Structure

This paper contains 22 sections, 493 equations, 3 figures.

Figures (3)

  • Figure 1: The convergence of the rescaled wavelet log-eigenvalues in the three-way limit $\frac{p \, 2^j}{n} \rightarrow c \in [0,\infty)$. In this simulation exercise, $X$ is an operator fractional Brownian motion (ofBm) and $Z$ is a vector of Gaussian white noise processes (see Section \ref{s:examples} for a discussion of these models). $X$ and $Z$ are generated independently. For each $n$, the coordinates matrix $\mathbf{P}\mathbf{P}_H$ is drawn at random with i.i.d. standard Gaussian entries and then normalized to have unit-norm columns. For notational simplicity, we re-express the scaling factor as $a(n) = 2^j$, where $j = j_n \rightarrow \infty$. For $r=6$ and $h_q \in \{0.1, 0.3, 0.5, 0.6, 0.8, 0.9\}$, $q = 1,\ldots,r$, the plots display the asymptotic behavior of $\frac{1}{2}[(\log \Lambda_{\ell}(2^j))/j-1]$, where $\Lambda_{\ell}(2^j) := \lambda_{\ell}(\mathbf{W}(2^j))$, $\ell = 1,\ldots,p$. The six dashed lines correspond to the values $\{0.1, 0.3, 0.5, 0.6, 0.8, 0.9\}$. In all plots, $p$, $n$ and $j$ increase while the ratio $p \, 2^j/n =: p/n_j = c$ remains fixed at $c = 1/2$ (left column) and $c = 1/4$ (right column). In light of Theorem \ref{t:lim_n_a_times_lambda/a^(2h+1)}, for $j = j_n \rightarrow \infty$, $(1/j)\log \Lambda_{p-r+q}(2^j) \stackrel{\mathbb{P}}{\rightarrow} 2h_q+1$, $q=1,\ldots,6$, and $(1/j)\log \Lambda_{p-r}(2^j) \stackrel{\mathbb{P}}{\rightarrow} 0$ in the three-way limit. As expected, the $r=6$ largest values of $\frac{1}{2}[(\log \Lambda_{\ell}(2^j))/j-1]$ approach $h_1,\ldots,h_6$ in the plots as $n$, $p$ and $j$ grow. By contrast, the remaining values of $\frac{1}{2}[(\log \Lambda_{\ell}(2^j))/j-1]$ tend toward zero as $j$ increases in all instances. For a fixed $c$ (column), going from the top to the bottom row, the magnitudes of $p$ and $2^j$ are larger and smaller, respectively, by a factor of 2 for each value of $n$. As a result, we observe near-convergence at smaller octaves $j$ (n.b.: axes have been shifted to align the curves). Similarly, going from the left to the right column (i.e., as $c$ decreases), near-convergence also occurs at smaller $j$, which reflects a less extreme high-dimensional regime (smaller $c$).
  • Figure 2: The fluctuations of wavelet log-eigenvalues in the three-way limit $\frac{p \, 2^j}{n} \rightarrow c = 1/2$. In the same simulation framework as for Figure \ref{fig:logeig}, Theorem \ref{t:asympt_normality_lambdap-r+q} predicts that the joint distribution of $\{\log \Lambda_{p-r+q}(2^j)\}_{q=1,\ldots,6}$ is asymptotically Gaussian in the three-way limit, after centering and rescaling. This convergence can be seen in the so-called Gamma plots displayed above, which are expected to look close to a straight line under joint Gaussianity (see, e.g., Johnson and Wichern (2002)). Based on 5000 realizations, the plots show the empirical quantiles of the squared Mahalanobis distance statistic vs. the theoretical quantiles of a $\chi^2_6$ distribution. The (effective) sample size $n_j = n/2^j$ increases from left to right, and $p/n_j$ is set equal to $1/2$. The plots also display Kolmogorov-Smirnov distance statistics $d_{KS}$, which tend to shrink for larger values of $n_j$.
  • Figure 3: Schematic representation of the behavior of wavelet eigenvectors vis-à-vis the coordinate vectors in the three-way limit (\ref{e:three-fold_lim}). The plot depicts the case $r = 3$ and $r_1 = r_2 = r_3 = 1$ (i.e., $h_1 < h_2 < h_3$). It visually represents the asymptotic behavior of the angles among the high-dimensional vectors. In the limit, we observe that $\textnormal{span}_{\ell=1,2,3}\{{\mathbf u}_{p-r+\ell}(n)\} \sim \textnormal{span}_{\ell=1,2,3}\{{\mathbf p}_{\ell}(n)\}$, $|\langle {\mathbf u}_{p-1}(n),{\mathbf p}_r(n)\rangle| \sim 0$ and $\max\{|\langle {\mathbf u}_{p-2}(n),{\mathbf p}_r(n)\rangle|, |\langle {\mathbf u}_{p-2}(n),{\mathbf p}_{r-1}(n)\rangle|\} \sim 0$. In particular, $\langle {\mathbf u}_p(n),{\mathbf p}_r(n)\rangle^2 \sim 1$ and, approximately, ${\mathbf u}_{p-1}(n) \in \textnormal{span}\{{\mathbf p}_{r-1}(n),{\mathbf p}_r(n)\}$. In the general case, the first three relations are given by (\ref{e:subseq_condition_1_top_explanation}), (\ref{e:subseq_condition_2_top_explanation}) and (\ref{e:subseq_condition_2_top_explanation-2}), respectively.
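
To make the rescaling used in Figure 1 concrete, the following minimal sketch evaluates $\frac{1}{2}[(\log \Lambda)/j-1]$ on idealized, noise-free eigenvalues of the form $C \, 2^{j(2h+1)}$. It is not the simulation behind the figures: no wavelet transform, ofBm or noise is involved. The function names and the constant `C = 3` are hypothetical, and the logarithm is assumed to be base 2, which is what makes the limit $2h_q+1$ hold at scale $2^j$.

```python
import math


def rescaled_log_eigenvalue(eigenvalue: float, j: int) -> float:
    """Return (1/2) * [log2(eigenvalue) / j - 1] (base-2 log is an assumption)."""
    return 0.5 * (math.log2(eigenvalue) / j - 1.0)


def idealized_eigenvalue(h: float, j: int, c: float = 3.0) -> float:
    """Hypothetical noise-free eigenvalue c * 2^(j * (2h + 1)).

    The constant c stands in for the scale-invariant limit; its value
    here is arbitrary and chosen only for illustration.
    """
    return c * 2.0 ** (j * (2.0 * h + 1.0))


# Scaling exponents used in Figure 1.
hurst = [0.1, 0.3, 0.5, 0.6, 0.8, 0.9]
# Octaves j at which the rescaled quantity is evaluated.
octaves = [2, 4, 8, 16]

estimates = {
    j: [rescaled_log_eigenvalue(idealized_eigenvalue(h, j), j) for h in hurst]
    for j in octaves
}

for j in octaves:
    print(j, [round(e, 3) for e in estimates[j]])
```

Under these assumptions the computed value equals $h + \log_2 C/(2j)$ exactly, so the offset from $h$ decays like $1/j$. This is one simple way to see why near-convergence in such plots requires fairly large octaves.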