Separating Oblivious and Adaptive Models of Variable Selection

Ziyun Chen, Jerry Li, Kevin Tian, Yusong Zhu

TL;DR

Under an oblivious model, the optimal $\ell_\infty$ error is attainable in near-linear time with $\approx k\log d$ samples, whereas in an adaptive model, $\gtrsim k^2$ samples are necessary for any algorithm to achieve this bound.

Abstract

Sparse recovery is among the most well-studied problems in learning theory and high-dimensional statistics. In this work, we investigate the statistical and computational landscapes of sparse recovery with $\ell_\infty$ error guarantees. This variant of the problem is motivated by \emph{variable selection} tasks, where the goal is to estimate the support of a $k$-sparse signal in $\mathbb{R}^d$. Our main contribution is a provable separation between the \emph{oblivious} (``for each'') and \emph{adaptive} (``for all'') models of $\ell_\infty$ sparse recovery. We show that under an oblivious model, the optimal $\ell_\infty$ error is attainable in near-linear time with $\approx k\log d$ samples, whereas in an adaptive model, $\gtrsim k^2$ samples are necessary for any algorithm to achieve this bound. This establishes a surprising contrast with the standard $\ell_2$ setting, where $\approx k \log d$ samples suffice even for adaptive sparse recovery. We conclude with a preliminary examination of a \emph{partially-adaptive} model, where we show nontrivial variable selection guarantees are possible with $\approx k\log d$ measurements.

Paper Structure (32 sections, 41 theorems, 152 equations, 5 algorithms)

Key Result

Theorem 1

Let $n = \Omega(k \log d)$, and let ${\mathbf{X}} \in \mathbb{R}^{n \times d}$ have i.i.d. $\mathcal{N}(0, \frac{1}{n})$ entries. There is an estimator which solves Problem \ref{prob:linfty_sparse} under Model \ref{model:for_each} with high probability. Moreover, the estimator can be computed in nearly-linear t
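
To make the setting of Theorem 1 concrete, the following is a minimal simulation sketch, not the paper's estimator. It draws a design matrix with i.i.d. $\mathcal{N}(0, \frac{1}{n})$ entries as in the theorem statement and, as an assumption about the oblivious model, a $k$-sparse signal chosen independently of ${\mathbf{X}}$. The linear observation model `y = X @ theta + noise`, the $\pm 1$ signal values, and the names `simulate`, `top_k_support`, `linf_error`, and `correlation_baseline` are all hypothetical choices made for illustration; the baseline is a naive one-step correlation estimate and carries none of the guarantees claimed in the theorem.

```python
import numpy as np


def simulate(n, d, k, sigma=0.0, seed=0):
    """Draw (X, y, theta) for an assumed model y = X @ theta + noise.

    X has i.i.d. N(0, 1/n) entries; theta is k-sparse with +/-1 entries
    (an illustrative choice) and is drawn independently of X.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, d))
    support = rng.choice(d, size=k, replace=False)
    theta = np.zeros(d)
    theta[support] = rng.choice([-1.0, 1.0], size=k)
    y = X @ theta + sigma * rng.normal(size=n)
    return X, y, theta


def top_k_support(v, k):
    """Indices of the k largest-magnitude entries of v, as a set."""
    return set(np.argpartition(np.abs(v), -k)[-k:].tolist())


def linf_error(estimate, theta):
    """ell_infinity distance between an estimate and the true signal."""
    return float(np.max(np.abs(estimate - theta)))


def correlation_baseline(X, y):
    """Naive baseline X^T y; NOT the estimator from the paper."""
    return X.T @ y


if __name__ == "__main__":
    n, d, k = 2000, 500, 5
    X, y, theta = simulate(n, d, k, sigma=0.1)
    estimate = correlation_baseline(X, y)
    print("ell_inf error of baseline:", linf_error(estimate, theta))
    print("true support:", sorted(np.flatnonzero(theta).tolist()))
    print("top-k of baseline:", sorted(top_k_support(estimate, k)))
```

The two helper metrics mirror the two goals named in the abstract: `linf_error` measures the $\ell_\infty$ error of an estimate, and `top_k_support` turns an estimate into a candidate support for variable selection.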

Theorems & Definitions (75)

  • Theorem 1: informal, see Theorem \ref{thm:l_inf_norm_bound_model1}
  • Theorem 2: informal, see Theorems \ref{thm:linf_iht}, \ref{thm:hard_inst_lower_bound}, and \ref{thm:support_lb}
  • Definition 1: sub-Gaussian distribution
  • Lemma 1: Hoeffding's inequality, Theorem 2.2.1, vershynin2018high
  • Lemma 2: Proposition 2.6.1, vershynin2018high
  • Lemma 3
  • proof
  • Definition 2: Restricted isometry property
  • Proposition 1: Theorem 9.2, foucart13
  • Definition 3: Pairwise incoherence
  • ...and 65 more