Asymptotic Behavior of Principal Component Projections for Multivariate Extremes
Holger Drees
TL;DR
This paper studies a PCA-based approach to estimating the extremal dependence structure of a regularly varying $d$-dimensional random vector via its angular measure. The angles of the $k$ observations with largest norm are projected onto a data-driven lower-dimensional subspace, and the paper derives the asymptotic behavior of this PCA projection and of the resulting excess risk, including explicit limit distributions when the limit projection is unique. Under mild conditions, the excess risk decreases much faster in $k$ than earlier empirical risk bounds suggested. The paper also establishes functional limit theorems for local empirical processes of the reconstruction error, uniformly over neighborhoods of the optimal projection, and builds on them to propose a data-driven rule for selecting the projection dimension. A simulation study examines the finite-sample performance of the resulting estimators.
Abstract
The extremal dependence structure of a regularly varying $d$-dimensional random vector can be described by its angular measure. The standard nonparametric estimator of this measure is the empirical measure of the observed angles of the $k$ random vectors with largest norm, for a suitably chosen number $k$. Due to the curse of dimensionality, this estimator is often inaccurate for moderate or large $d$. If the angular measure is concentrated in the vicinity of a lower-dimensional subspace, then first projecting the data onto a lower-dimensional subspace obtained by a principal component analysis of the angles of extreme observations can substantially improve the performance of the estimator. We derive the asymptotic behavior of such PCA projections and the resulting excess risk. In particular, we show that, under mild conditions, the excess risk (as a function of $k$) decreases much faster than suggested by the empirical risk bounds obtained in \cite{DS21}. Moreover, we establish functional limit theorems for local empirical processes of the (empirical) reconstruction error of projections, uniformly over neighborhoods of the true optimal projection. Based on these asymptotic results, we propose a data-driven method to select the dimension of the projection space. Finally, the finite-sample performance of the resulting estimators is examined in a simulation study.
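To make the procedure described above concrete, here is a minimal Python sketch of the three basic steps: extracting the angles of the $k$ observations with largest norm, computing a PCA projection from those angles, and evaluating the empirical reconstruction error. It is an illustration under stated assumptions, not the paper's implementation. The function names are hypothetical, the Euclidean norm is assumed, and the PCA is assumed to use the uncentered second-moment matrix of the angles. The paper's data-driven rule for choosing the dimension is not reproduced here.

```python
import numpy as np


def extreme_angles(X, k):
    """Angles X_i / ||X_i|| of the k rows of X with largest Euclidean norm.

    Assumption: the Euclidean norm is used; the abstract does not fix a norm.
    """
    norms = np.linalg.norm(X, axis=1)
    idx = np.argsort(norms)[-k:]  # indices of the k largest norms
    return X[idx] / norms[idx, None]


def pca_projection(angles, p):
    """Orthogonal projection matrix onto the span of the top-p eigenvectors.

    Assumption: the eigenvectors are those of the uncentered empirical
    second-moment matrix of the angles.
    """
    S = angles.T @ angles / angles.shape[0]
    _, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    V = eigvecs[:, -p:]             # eigenvectors of the p largest eigenvalues
    return V @ V.T


def empirical_reconstruction_error(angles, P):
    """Mean squared Euclidean distance between the angles and their projections."""
    resid = angles - angles @ P     # P is symmetric, so angles @ P projects each row
    return np.mean(np.sum(resid**2, axis=1))


if __name__ == "__main__":
    # Hypothetical toy data: heavy-tailed radii times directions lying close
    # to a 2-dimensional subspace of R^10.
    rng = np.random.default_rng(0)
    n, d = 5000, 10
    directions = np.zeros((n, d))
    directions[:, :2] = rng.standard_normal((n, 2))
    directions += 0.05 * rng.standard_normal((n, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = rng.pareto(2.0, size=n) + 1.0
    X = radii[:, None] * directions

    angles = extreme_angles(X, k=200)
    for p in range(1, 5):
        err = empirical_reconstruction_error(angles, pca_projection(angles, p))
        print(p, err)
```

With this choice of projection, the empirical reconstruction error equals the sum of the discarded eigenvalues of the second-moment matrix, so it is nonincreasing in the dimension `p`. The projected angles can then be fed into an estimator of the angular measure, for instance the empirical measure of the projected points.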
