Statistical Query Lower Bounds for Learning Truncated Gaussians

Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas, Nikos Zarifis

TL;DR

The paper proves a Statistical Query (SQ) lower bound for estimating the mean of a truncated identity-covariance Gaussian: any SQ algorithm for this problem has complexity $d^{\mathrm{poly}(1/\epsilon)}$, even when the class $\mathcal{C}$ of truncation sets is simple. Concretely, the bound applies when $\mathcal{C}$ consists of unions of a bounded number of rectangles whose VC dimension and Gaussian surface area are small.

Abstract

We study the problem of estimating the mean of an identity covariance Gaussian in the truncated setting, in the regime when the truncation set comes from a low-complexity family $\mathcal{C}$ of sets. Specifically, for a fixed but unknown truncation set $S \subseteq \mathbb{R}^d$, we are given access to samples from the distribution $\mathcal{N}(\boldsymbol{\mu}, \mathbf{I})$ truncated to the set $S$. The goal is to estimate $\boldsymbol{\mu}$ within accuracy $\epsilon>0$ in $\ell_2$-norm. Our main result is a Statistical Query (SQ) lower bound suggesting a super-polynomial information-computation gap for this task. In more detail, we show that the complexity of any SQ algorithm for this problem is $d^{\mathrm{poly}(1/\epsilon)}$, even when the class $\mathcal{C}$ is simple so that $\mathrm{poly}(d/\epsilon)$ samples information-theoretically suffice. Concretely, our SQ lower bound applies when $\mathcal{C}$ is a union of a bounded number of rectangles whose VC dimension and Gaussian surface are small. As a corollary of our construction, it also follows that the complexity of the previously known algorithm for this task is qualitatively best possible.
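
To make the sampling model concrete, here is a minimal sketch of rejection sampling from $\mathcal{N}(\boldsymbol{\mu}, \mathbf{I})$ truncated to a set $S$. The function name `sample_truncated_gaussian`, the membership predicate `in_S`, and the example set (the complement of a single axis-aligned box) are illustrative assumptions, not constructions from the paper. The sketch also suggests why the task is nontrivial: the empirical mean of the truncated samples is, in general, a biased estimate of $\boldsymbol{\mu}$.

```python
import numpy as np

def sample_truncated_gaussian(mu, in_S, n, rng, batch=4096):
    """Draw n samples from N(mu, I) conditioned on lying in S (rejection sampling).

    in_S is a hypothetical vectorized predicate: (m, d) array -> (m,) boolean array.
    Assumes S has non-negligible Gaussian mass; otherwise the loop may run very long.
    """
    mu = np.asarray(mu, dtype=float)
    accepted, count = [], 0
    while count < n:
        x = rng.standard_normal((batch, mu.size)) + mu  # untruncated draws
        keep = x[in_S(x)]                               # discard points outside S
        accepted.append(keep)
        count += len(keep)
    return np.concatenate(accepted)[:n]

def outside_box(x):
    """Illustrative truncation set: the complement of the box [0, 2] x [-1, 1]."""
    inside = (x[:, 0] >= 0.0) & (x[:, 0] <= 2.0) & (np.abs(x[:, 1]) <= 1.0)
    return ~inside

rng = np.random.default_rng(0)
mu = np.array([0.5, 0.0])
samples = sample_truncated_gaussian(mu, outside_box, 20000, rng)

# The naive estimator ignores the truncation and is pulled away from mu,
# because the removed box sits mostly to the right of mu's first coordinate.
naive_mean = samples.mean(axis=0)
print("true mean:", mu, "naive estimate:", naive_mean)
```

In this toy example the first coordinate of the naive estimate falls noticeably below the true value of 0.5, so any accurate estimator has to account for the unknown set $S$.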

Paper Structure (17 sections, 12 theorems, 43 equations, 1 figure)

Key Result

Theorem 1.2

Let $d, k\in \mathbb Z_+$, $\varepsilon>d^{-c}$ for some sufficiently small constant $c>0$, and assume $k \leq c/\varepsilon^{0.15}$. Let $\mathcal{C}$ be the class of all sets $S \subseteq \mathbb R^d$ with the properties that: (i) $S$ is the complement of a union of at most $k^2$ rectangles, and (

Figures (1)

  • Figure 1: Truncation set in $\mathbb{R}^2$. The red dotted parts of the horizontal and vertical axes represent the unions of intervals $U$ and $T$, respectively. The white rectangles represent the set $T \times U$, and the remaining gray area of $\mathbb{R}^2$ is their complement, denoted as $(T \times U)^c$, on which we truncate the Gaussian distribution $\mathcal{N}((\varepsilon,0),\mathbf{I}_{2\times 2})$.

Theorems & Definitions (35)

  • Definition 1.1: STAT Oracle
  • Theorem 1.2: SQ Lower Bound for Learning Truncated Gaussians
  • Definition 2.1: Truncated Gaussian
  • Definition 2.2: Gaussian Surface Area
  • Definition 2.3: Hidden Direction Distribution
  • Theorem 3.1: SQ Lower Bound; Hypothesis Testing Hardness
  • Proposition 3.1
  • Lemma 3.2
  • Lemma 3.3
  • Proof: Proof of \ref{lem:A}
  • ...and 25 more
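
As an illustration of the STAT oracle named in Definition 1.1, the sketch below simulates one under the standard definition, which is assumed here rather than quoted from the paper: given a bounded query $f:\mathbb{R}^d \to [-1,1]$ and a tolerance $\tau$, the oracle may return any value within $\tau$ of $\mathbb{E}[f(x)]$. The helper `make_stat_oracle` is a hypothetical name, the expectation is taken over an empirical sample rather than the true distribution, and the perturbation is drawn at random, whereas a worst-case oracle may choose it adversarially.

```python
import numpy as np

def make_stat_oracle(samples, tau, rng):
    """Return a simulated STAT(tau) oracle over the empirical distribution of `samples`."""
    def oracle(query):
        # Queries must be bounded in [-1, 1]; clipping enforces this.
        values = np.clip(query(samples), -1.0, 1.0)
        # Any answer within tolerance tau of the expectation is a valid response.
        noise = rng.uniform(-tau, tau)
        return float(values.mean() + noise)
    return oracle

rng = np.random.default_rng(1)
samples = rng.standard_normal((50000, 3))  # stand-in data, not a truncated Gaussian
tau = 0.01
oracle = make_stat_oracle(samples, tau, rng)

query = lambda x: np.tanh(x[:, 0])          # a bounded query on the first coordinate
answer = oracle(query)
exact = float(np.tanh(samples[:, 0]).mean())
print("oracle answer:", answer, "empirical expectation:", exact)
```

An SQ algorithm interacts with the distribution only through calls such as `oracle(query)`, so its complexity is measured by the number of queries it makes and the tolerance it requires.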