
Sparsity meets correlation in Gaussian sequence model

Subhodh Kotekal, Chao Gao

TL;DR

This work derives sharp minimax rates for estimating an $s$-sparse signal in a $p$-dimensional Gaussian sequence model with equicorrelated observations, revealing a phase transition driven by $p-2s$ and a nontrivial dependence on the correlation level $\gamma$. The authors decompose the problem into estimating the high-dimensional projection $\theta-\bar{\theta}\mathbf{1}_p$ via sparse regression and estimating the linear functional $\bar{\theta}$ using a kernel mode estimator, with oversmoothing (bandwidth widening) exploiting Gaussian structure to achieve optimal rates. They establish a precise piecewise rate formula $\varepsilon^*(p,s,\gamma)^2$, prove matching lower bounds, and develop adaptive procedures that achieve the minimax rate without knowledge of $s$ or $\gamma$, including a correlation estimator and Lepski-type tuning. The results illuminate when correlation is a blessing versus a curse and have implications for large-scale inference under dependency, robust estimation of location parameters, and extensions to multi-block correlation structures.

Abstract

We study estimation of an $s$-sparse signal in the $p$-dimensional Gaussian sequence model with equicorrelated observations and derive the minimax rate. A new phenomenon emerges from correlation, namely the rate scales with respect to $p-2s$ and exhibits a phase transition at $p-2s \asymp \sqrt{p}$. Correlation is shown to be a blessing provided it is sufficiently strong, and the critical correlation level exhibits a delicate dependence on the sparsity level. Due to correlation, the minimax rate is driven by two subproblems: estimation of a linear functional (the average of the signal) and estimation of the signal's $(p-1)$-dimensional projection onto the orthogonal subspace. The high-dimensional projection is estimated via sparse regression and the linear functional is cast as a robust location estimation problem. Existing robust estimators turn out to be suboptimal, and we show a kernel mode estimator with a widening bandwidth exploits the Gaussian character of the data to achieve the optimal estimation rate.
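The two subproblems can be illustrated with a small simulation. The following is a minimal sketch, not the authors' procedure. It assumes the equicorrelated model can be written as $X_i = \theta_i + \sqrt{\gamma}\,W + \sqrt{1-\gamma}\,Z_i$ with $W, Z_1, \dots, Z_p$ i.i.d. standard normal, which gives covariance $(1-\gamma)I_p + \gamma\mathbf{1}_p\mathbf{1}_p^\top$. The function names, the signal values, the grid search, and the bandwidth `h` are all hypothetical choices made for illustration. The sketch checks that centering removes the shared factor $W$, and that a Gaussian kernel mode estimate locates the shared shift $\sqrt{\gamma}\,W$ even though the $s$ nonzero coordinates act as contamination.

```python
import math
import random


def observe(theta, gamma, w, z):
    """X_i = theta_i + sqrt(gamma) * w + sqrt(1 - gamma) * z_i (assumed model form)."""
    a, b = math.sqrt(gamma), math.sqrt(1.0 - gamma)
    return [t + a * w + b * zi for t, zi in zip(theta, z)]


def center(x):
    """Subtract the sample mean, i.e. project away the all-ones direction."""
    xbar = sum(x) / len(x)
    return [xi - xbar for xi in x]


def kernel_mode(x, h, grid_size=400):
    """Grid argmax of a Gaussian kernel density estimate with bandwidth h."""
    lo, hi = min(x), max(x)
    best_m, best_val = lo, float("-inf")
    for k in range(grid_size + 1):
        m = lo + (hi - lo) * k / grid_size
        val = sum(math.exp(-0.5 * ((xi - m) / h) ** 2) for xi in x)
        if val > best_val:
            best_m, best_val = m, val
    return best_m


rng = random.Random(0)
p, s, gamma = 2000, 200, 0.5             # hypothetical problem sizes
theta = [5.0] * s + [0.0] * (p - s)      # hypothetical s-sparse signal
z = [rng.gauss(0.0, 1.0) for _ in range(p)]
w = rng.gauss(0.0, 1.0)                  # shared factor that induces the correlation

x = observe(theta, gamma, w, z)
x_shifted = observe(theta, gamma, w + 3.0, z)  # same noise, different shared factor

# Centering cancels the shared factor, so the projection subproblem does not see it.
max_gap = max(abs(u - v) for u, v in zip(center(x), center(x_shifted)))

# The p - s null coordinates are centred at mu; the s signal coordinates are outliers.
mu = math.sqrt(gamma) * w
h = 0.7                                  # hypothetical bandwidth, not the paper's tuning
mode_est = kernel_mode(x, h)
mean_est = sum(x) / p                    # non-robust baseline, biased by the signal

print("max gap after centering:", max_gap)
print("kernel mode error:", abs(mode_est - mu))
print("sample mean error:", abs(mean_est - mu))
```

In this toy configuration the contamination sits far from the bulk, so the problem is easy. The regimes the abstract highlights, where $p-2s$ is small, are much harder, and this sketch says nothing about the rates attained there.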

Paper Structure

This paper contains 59 sections, 79 theorems, 393 equations, and 4 figures.

Key Result

Proposition 1

If $1 \leq s \leq p$ and $\gamma \in [0, 1]$, then for any $\delta \in (0, 1)$ we have

Figures (4)

  • Figure 1: Plots of the rate $\varepsilon^*(p, s, \gamma)^2$ against $s$ with $p = 100$ and $\gamma = 1-p^{-\kappa}$ for various choices of $\kappa$.
  • Figure 2: Plots of $G_h$ and $J_h$ with $\mu = -2, \eta = 2, h = 0.25, p = 10000, s = \left\lfloor \frac{p}{2} - 10\sqrt{p}\right\rfloor = 4000, m \approx -1.999$.
  • Figure 3: Plots of $G_{\text{Huber}, h}$ and $J_{\text{Huber}, h}$ with $\mu = -2, \eta = 2, h = 0.25, p = 10000, s = \left\lfloor \frac{p}{2} - 10\sqrt{p}\right\rfloor = 4000, m_{\text{Huber}} \approx 1.750$.
  • Figure 4: Cartoon schematic of the case organization of the proof of Proposition \ref{prop:fmax_order} in the special case $k = 2$. The solid line represents the function $f$. For a given point $v$, the strategy in the proof is to find a $w(v)$ in the shaded region such that $f(v) \leq f(w(v))$. The shaded region represents $[\mu, \mu+\Delta(h)]$, the dashed line between Case 1 and Case 2 is located at $\frac{\mu+\eta_1}{2}$, the other two dashed lines are located at $y_1^*$ and $y_2^*$ respectively, and the dotted line is located at $\frac{\mu+\eta_2}{2}$.

Theorems & Definitions (141)

  • Definition 1
  • Remark 1: Testing vs estimation
  • Remark 2: Large-scale inference
  • Proposition 1
  • Proposition 2
  • Remark 3: Strong restricted eigenvalue condition
  • Theorem 1
  • Proposition 3
  • Proposition 4
  • Theorem 2
  • ...and 131 more