Phase transition for conditional covariance matrices estimated by importance sampling, and implications for cross-entropy schemes in high dimension
Jason Beh, Jerome Morio, Florian Simatos
TL;DR
The paper analyzes high-dimensional covariance estimation in cross-entropy (CE) schemes via a random-matrix model with dependent, heavy-tailed weights, and proves a phase transition in the polynomial regime $n = d^κ$ governed by a threshold $κ_*$. The threshold is tied to the tail behavior of the likelihood ratios and to the smallest eigenvalue $λ_1 = λ_{\min}(Σ)$ of the auxiliary covariance $Σ$; in particular, $κ_* = 1/λ_1$ in the bad projection case $V \subset U_\perp$, and $1 \le κ_* \le 1/λ_1$ in the good case $V \subset U$. The authors connect this spectral viewpoint to CE schemes with projection, showing that a larger $λ_1$ (and thus a larger $λ_{\min}$ of the projected covariance) yields more stable and accurate estimators, a finding supported by numerical experiments on several test functions. The results offer a spectral criterion for designing efficient high-dimensional CE algorithms and open directions for combining projection strategies with broader CE frameworks and advanced subspace methods.
Abstract
Motivated by the estimation of covariance matrices by importance sampling arising in the cross-entropy (CE) algorithm, we study a random matrix model $\hat Σ = {\bf X} L {\bf X}^\top$ with two distinct features: ${\bf X}$ and $L$ are dependent, and $L$ is heavy-tailed. In the high-dimensional regime $d \to \infty$, we prove under suitable assumptions that a phase transition occurs in the polynomial regime $n = d^κ$, with $n$ the sample size. Namely, we prove that $\lVert \hat Σ - E \hat Σ \rVert \Rightarrow 0$ if and only if $κ > κ_*$ for some threshold $κ_*$ determined by the behavior of the maximum likelihood ratios. Moreover, we identify general situations where $κ_* = 1/λ_1$, with $λ_1$ the smallest eigenvalue of the covariance matrix of the auxiliary distribution used to estimate $\hat Σ$ by importance sampling. This suggests that importance sampling will work better with covariance matrices having a large smallest eigenvalue. We carry this insight into recent CE schemes proposed to estimate the probability of high-dimensional rare events. Through numerical simulations, we demonstrate that better CE schemes are also the ones with a larger smallest eigenvalue, even though these algorithms were not designed to smooth the spectrum. This new spectral interpretation raises stimulating questions and opens research directions for the design of efficient high-dimensional algorithms.
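To make the model $\hat Σ = {\bf X} L {\bf X}^\top$ concrete, the following is a minimal toy sketch and not the paper's setting. It assumes an unconditional standard Gaussian target $N(0, I_d)$ and an isotropic auxiliary distribution $N(0, λ I_d)$, so that $E \hat Σ = I_d$ and the only eigenvalue of the auxiliary covariance is $λ$. The function names `is_covariance` and `spectral_error` and the parameter `lam` are hypothetical and chosen for illustration only.

```python
import numpy as np


def is_covariance(n, d, lam, rng):
    """Importance-sampling estimate of the covariance of N(0, I_d).

    Toy assumption: samples come from the auxiliary law N(0, lam * I_d).
    Returns X L X^T, where L is diagonal with entries (likelihood ratio) / n.
    """
    # Columns of X are the n auxiliary samples (X has shape d x n).
    X = np.sqrt(lam) * rng.standard_normal((d, n))
    sq_norms = np.sum(X**2, axis=0)
    # log of f(x) / g(x) with f = N(0, I_d) and g = N(0, lam * I_d).
    log_w = 0.5 * d * np.log(lam) + 0.5 * sq_norms * (1.0 / lam - 1.0)
    w = np.exp(log_w) / n
    # Scaling column i of X by w[i] and multiplying by X^T gives X diag(w) X^T.
    return (X * w) @ X.T


def spectral_error(n, d, lam, rng):
    """Spectral norm of (Sigma_hat - I_d), where I_d = E[Sigma_hat] in this toy model."""
    sigma_hat = is_covariance(n, d, lam, rng)
    return np.linalg.norm(sigma_hat - np.eye(d), ord=2)
```

Here the weights depend on the same samples that form ${\bf X}$, and for `lam < 1` they grow exponentially in the squared norm of the sample, which illustrates the dependence and heavy tails mentioned above. One way to probe the polynomial regime empirically would be to evaluate `spectral_error(int(d**kappa), d, lam, rng)` over a grid of `kappa` and `lam` values for increasing `d`; this is only a suggestion, and no particular outcome is claimed here.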
