Sparsity meets correlation in Gaussian sequence model

Subhodh Kotekal; Chao Gao

TL;DR

This work derives sharp minimax rates for estimating an $s$-sparse signal in a $p$-dimensional Gaussian sequence model with equicorrelated observations, revealing a phase transition driven by $p-2s$ and a nontrivial dependence on the correlation level $\gamma$. The authors decompose the problem into estimating the high-dimensional projection $\theta-\bar{\theta}\mathbf{1}_p$ via sparse regression and estimating the linear functional $\bar{\theta}$ using a kernel mode estimator, with oversmoothing (bandwidth widening) exploiting Gaussian structure to achieve optimal rates. They establish a precise piecewise rate formula $\varepsilon^*(p,s,\gamma)^2$, prove matching lower bounds, and develop adaptive procedures that achieve the minimax rate without knowledge of $s$ or $\gamma$, including a correlation estimator and Lepski-type tuning. The results illuminate when correlation is a blessing versus a curse and have implications for large-scale inference under dependency, robust estimation of location parameters, and extensions to multi-block correlation structures.
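The decomposition described above can be checked numerically. The sketch below is illustrative only and assumes the model is parametrized as $X \sim N(\theta, (1-\gamma)I_p + \gamma \mathbf{1}_p\mathbf{1}_p^\top)$, which can be written as $X = \theta + \sqrt{1-\gamma}\,Z + \sqrt{\gamma}\,W\mathbf{1}_p$ with $Z \sim N(0, I_p)$ and a shared scalar $W \sim N(0,1)$. The helper names (`simulate_equicorrelated`, `center`) and the values of `p`, `s`, `gamma` are hypothetical choices for illustration, not taken from the paper. Centering the data removes the shared factor $W$ exactly, so the projection $\theta-\bar{\theta}\mathbf{1}_p$ is observed with independent-type noise of variance $1-\gamma$, while $\bar{\theta}$ has to be recovered separately.

```python
import math
import random

def simulate_equicorrelated(theta, gamma, rng):
    """Draw X = theta + sqrt(1-gamma) * Z + sqrt(gamma) * W * 1 (assumed model form)."""
    w = rng.gauss(0.0, 1.0)                    # shared factor, common to every coordinate
    z = [rng.gauss(0.0, 1.0) for _ in theta]   # independent coordinate-wise noise
    x = [t + math.sqrt(1 - gamma) * zi + math.sqrt(gamma) * w
         for t, zi in zip(theta, z)]
    return x, z, w

def center(v):
    """Project v onto the orthogonal complement of the all-ones vector."""
    mean = sum(v) / len(v)
    return [vi - mean for vi in v]

rng = random.Random(0)
p, s, gamma = 200, 10, 0.9                     # illustrative values only
theta = [3.0] * s + [0.0] * (p - s)            # an s-sparse signal
x, z, w = simulate_equicorrelated(theta, gamma, rng)

# Centered data depend only on the centered signal and the independent noise:
# the shared factor W cancels exactly.
lhs = center(x)
rhs = [a + math.sqrt(1 - gamma) * b for a, b in zip(center(theta), center(z))]
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))

# The linear functional that centering discards.
theta_bar = sum(theta) / p
```

Here `max_gap` is zero up to floating-point error, which is the numerical counterpart of the statement that the projection subproblem does not see the common factor.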

Abstract

We study estimation of an $s$-sparse signal in the $p$-dimensional Gaussian sequence model with equicorrelated observations and derive the minimax rate. A new phenomenon emerges from correlation, namely the rate scales with respect to $p-2s$ and exhibits a phase transition at $p-2s \asymp \sqrt{p}$. Correlation is shown to be a blessing provided it is sufficiently strong, and the critical correlation level exhibits a delicate dependence on the sparsity level. Due to correlation, the minimax rate is driven by two subproblems: estimation of a linear functional (the average of the signal) and estimation of the signal's $(p-1)$-dimensional projection onto the orthogonal subspace. The high-dimensional projection is estimated via sparse regression and the linear functional is cast as a robust location estimation problem. Existing robust estimators turn out to be suboptimal, and we show a kernel mode estimator with a widening bandwidth exploits the Gaussian character of the data to achieve the optimal estimation rate.
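To make the idea of a kernel mode estimator concrete, here is a minimal generic sketch: it maximizes a Gaussian kernel density estimate over a grid and is compared with the sample mean on contaminated data. This is not the paper's exact estimator or bandwidth schedule; the function name `kernel_mode`, the grid search, the contamination setup, and the bandwidth value are all assumptions made for illustration.

```python
import math
import random

def kernel_mode(data, bandwidth, grid_step=0.02):
    """Return the grid point that maximizes a Gaussian kernel density estimate."""
    lo, hi = min(data), max(data)
    n_grid = int((hi - lo) / grid_step) + 1
    best_x, best_val = lo, -1.0
    for k in range(n_grid):
        x = lo + k * grid_step
        # Unnormalized kernel density at x; the normalizing constant
        # does not change the location of the maximum.
        val = sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)
        if val > best_val:
            best_x, best_val = x, val
    return best_x

rng = random.Random(1)
location = 1.0                                             # hypothetical true location
inliers = [rng.gauss(location, 1.0) for _ in range(420)]   # majority: Gaussian around the location
outliers = [rng.gauss(8.0, 1.0) for _ in range(180)]       # minority: shifted coordinates
data = inliers + outliers

mode_estimate = kernel_mode(data, bandwidth=1.0)
mean_estimate = sum(data) / len(data)
```

In this toy setup the sample mean is pulled toward the shifted minority, while the kernel mode stays near the location of the majority. The bandwidth is left as a free parameter here; the abstract's point is that, for Gaussian data, a widening rather than shrinking bandwidth turns out to be the right choice.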

Paper Structure (59 sections, 79 theorems, 393 equations, 4 figures)

Introduction
Related work
A preview of the interaction between sparsity and correlation
The "decorrelate-then-regress" strategy fails
Main contribution
Notation
Estimation of a projection: sparse regression
Estimation of a linear functional: kernel mode estimator
An illustration in a special case
A widening, instead of shrinking, bandwidth
Connection to robust statistics
Upper bound
Lower bound
Regime $1 \leq s \leq \frac{p}{2} - \sqrt{p}$
Regime $\frac{p}{2} - \sqrt{p} < s < \frac{p}{2}$
...and 44 more sections

Key Result

Proposition 1

If $1 \leq s \leq p$ and $\gamma \in [0, 1]$, then for any $\delta \in (0, 1)$ we have

Figures (4)

Figure 1: Plots of the rate $\varepsilon^*(p, s, \gamma)^2$ against $s$ with $p = 100$ and $\gamma = 1-p^{-\kappa}$ for various choices of $\kappa$.
Figure 2: Plots of $G_h$ and $J_h$ with $\mu = -2, \eta = 2, h = 0.25, p = 10000, s = \left\lfloor \frac{p}{2} - 10\sqrt{p}\right\rfloor = 4000, m \approx -1.999$.
Figure 3: Plots of $G_{\text{Huber}, h}$ and $J_{\text{Huber}, h}$ with $\mu = -2, \eta = 2, h = 0.25, p = 10000, s = \left\lfloor \frac{p}{2} - 10\sqrt{p}\right\rfloor = 4000, m_{\text{Huber}} \approx 1.750$.
Figure 4: Cartoon schematic of the case organization of the proof of Proposition \ref{prop:fmax_order} in the special case $k = 2$. The solid line represents the function $f$. For a given point $v$, the strategy in the proof is to find a $w(v)$ in the shaded region such that $f(v) \leq f(w(v))$. The shaded region represents $[\mu, \mu+\Delta(h)]$, the dashed line between Case 1 and Case 2 is located at $\frac{\mu+\eta_1}{2}$, the other two dashed lines are located at $y_1^*$ and $y_2^*$ respectively, and the dotted line is located at $\frac{\mu+\eta_2}{2}$.

Theorems & Definitions (141)

Definition 1
Remark 1: Testing vs estimation
Remark 2: Large-scale inference
Proposition 1
Proposition 2
Remark 3: Strong restricted eigenvalue condition
Theorem 1
Proposition 3
Proposition 4
Theorem 2
...and 131 more
