Sparsity meets correlation in Gaussian sequence model
Subhodh Kotekal, Chao Gao
TL;DR
This work derives sharp minimax rates for estimating an $s$-sparse signal in a $p$-dimensional Gaussian sequence model with equicorrelated observations, revealing a phase transition driven by $p-2s$ and a nontrivial dependence on the correlation level $\gamma$. The authors decompose the problem into estimating the high-dimensional projection $\theta-\bar{\theta}\mathbf{1}_p$ via sparse regression and estimating the linear functional $\bar{\theta}$ using a kernel mode estimator, with oversmoothing (bandwidth widening) exploiting Gaussian structure to achieve optimal rates. They establish a precise piecewise rate formula $\varepsilon^*(p,s,\gamma)^2$, prove matching lower bounds, and develop adaptive procedures that achieve the minimax rate without knowledge of $s$ or $\gamma$, including a correlation estimator and Lepski-type tuning. The results illuminate when correlation is a blessing versus a curse and have implications for large-scale inference under dependency, robust estimation of location parameters, and extensions to multi-block correlation structures.
Abstract
We study estimation of an $s$-sparse signal in the $p$-dimensional Gaussian sequence model with equicorrelated observations and derive the minimax rate. A new phenomenon emerges from correlation, namely the rate scales with respect to $p-2s$ and exhibits a phase transition at $p-2s \asymp \sqrt{p}$. Correlation is shown to be a blessing provided it is sufficiently strong, and the critical correlation level exhibits a delicate dependence on the sparsity level. Due to correlation, the minimax rate is driven by two subproblems: estimation of a linear functional (the average of the signal) and estimation of the signal's $(p-1)$-dimensional projection onto the orthogonal subspace. The high-dimensional projection is estimated via sparse regression and the linear functional is cast as a robust location estimation problem. Existing robust estimators turn out to be suboptimal, and we show a kernel mode estimator with a widening bandwidth exploits the Gaussian character of the data to achieve the optimal estimation rate.
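To make the two subproblems concrete, here is a minimal simulation sketch. It is not the authors' procedure. It assumes the model can be written as $X_i = \theta_i + \sqrt{\gamma}\,W + \sqrt{1-\gamma}\,Z_i$ with a shared standard normal $W$ and independent standard normal $Z_i$, which gives unit variances and pairwise correlation $\gamma$. Under that assumption, the coordinates off the support are i.i.d. around the common shift $\sqrt{\gamma}\,W$, so the shift can be treated as a location parameter contaminated by at most $s$ outliers. The function names, the box kernel, the grid search, and the bandwidth of two noise standard deviations are all illustrative choices, not the tuning analyzed in the paper.

```python
import numpy as np


def simulate(theta, gamma, rng):
    """Draw X = theta + sqrt(gamma) * W + sqrt(1 - gamma) * Z (assumed model form)."""
    w = rng.standard_normal()              # shared factor inducing equicorrelation
    z = rng.standard_normal(theta.size)    # independent coordinate-wise noise
    return theta + np.sqrt(gamma) * w + np.sqrt(1.0 - gamma) * z, w


def box_kernel_mode(x, bandwidth, grid_size=2001):
    """Center c maximizing the number of points in [c - bandwidth, c + bandwidth].

    Candidate centers come from a uniform grid over the data range; ties are
    resolved by averaging the maximizing grid points.
    """
    xs = np.sort(x)
    grid = np.linspace(xs[0], xs[-1], grid_size)
    right = np.searchsorted(xs, grid + bandwidth, side="right")
    left = np.searchsorted(xs, grid - bandwidth, side="left")
    counts = right - left
    return grid[counts == counts.max()].mean()


rng = np.random.default_rng(0)
p, s, gamma = 2000, 50, 0.9
theta = np.zeros(p)
theta[:s] = 5.0                            # s-sparse signal (illustrative magnitudes)

x, w = simulate(theta, gamma, rng)

# Subproblem 1: projection onto the subspace orthogonal to the all-ones vector.
# Centering removes the shared factor W, leaving theta - mean(theta) plus noise.
x_perp = x - x.mean()

# Subproblem 2: the average of the signal. The mode estimate tracks the shift
# sqrt(gamma) * W, and subtracting it from the sample mean targets mean(theta).
bandwidth = 2.0 * np.sqrt(1.0 - gamma)     # deliberately wide; hypothetical tuning
shift_hat = box_kernel_mode(x, bandwidth)
theta_bar_hat = x.mean() - shift_hat

print("true mean of theta:", theta.mean())
print("naive sample mean: ", x.mean())
print("mode-corrected:    ", theta_bar_hat)
```

The naive sample mean carries the full shift $\sqrt{\gamma}\,W$, which does not shrink as $p$ grows, whereas the mode-corrected estimate removes it up to the error of the location estimate. The centered vector `x_perp` is what a sparse regression step would then take as input.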
