Sum-of-squares lower bounds for Non-Gaussian Component Analysis

Ilias Diakonikolas, Sushrut Karmalkar, Shuo Pang, Aaron Potechin

TL;DR

The main contribution is the first super-constant degree SoS lower bound for NGCA, which significantly strengthens prior work by establishing a super-polynomial information-computation tradeoff against a broader family of algorithms.

Abstract

Non-Gaussian Component Analysis (NGCA) is the statistical task of finding a non-Gaussian direction in a high-dimensional dataset. Specifically, given i.i.d. samples from a distribution $P^A_{v}$ on $\mathbb{R}^n$ that behaves like a known distribution $A$ in a hidden direction $v$ and like a standard Gaussian in the orthogonal complement, the goal is to approximate the hidden direction. The standard formulation posits that the first $k-1$ moments of $A$ match those of the standard Gaussian and the $k$-th moment differs. Under mild assumptions, this problem has sample complexity $O(n)$. On the other hand, all known efficient algorithms require $\Omega(n^{k/2})$ samples. Prior work developed sharp Statistical Query and low-degree testing lower bounds suggesting an information-computation tradeoff for this problem. Here we study the complexity of NGCA in the Sum-of-Squares (SoS) framework. Our main contribution is the first super-constant degree SoS lower bound for NGCA. Specifically, we show that if the non-Gaussian distribution $A$ matches the first $(k-1)$ moments of $\mathcal{N}(0, 1)$ and satisfies other mild conditions, then with fewer than $n^{(1 - \varepsilon)k/2}$ many samples from the normal distribution, with high probability, degree $(\log n)^{\frac{1}{2}-o_n(1)}$ SoS fails to refute the existence of such a direction $v$. Our result significantly strengthens prior work by establishing a super-polynomial information-computation tradeoff against a broader family of algorithms. As corollaries, we obtain SoS lower bounds for several problems in robust statistics and the learning of mixture models. Our SoS lower bound proof introduces a novel technique that we believe may be of broader interest, along with a number of refinements over existing methods.
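
To make the planted distribution $P^A_{v}$ concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not code from the paper: the choice of $A$ as a Rademacher variable (uniform on $\{-1,+1\}$), the dimension, the sample count, and all function names are hypothetical. Rademacher is used only because it is a simple case of moment matching: its first three moments agree with those of $\mathcal{N}(0,1)$ while its fourth moment is 1 rather than 3, so $k=4$ in the notation of the abstract. Whether it satisfies the paper's additional mild conditions is not claimed here.

```python
import math
import random


def unit_vector(n, rng):
    """Return a random unit vector in R^n (a normalized Gaussian vector)."""
    raw = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(r * r for r in raw))
    return [r / norm for r in raw]


def dot(x, y):
    """Euclidean inner product of two equal-length lists."""
    return sum(xi * yi for xi, yi in zip(x, y))


def sample_planted(v, sample_a, rng):
    """One draw x with <x, v> distributed as A and standard Gaussian orthogonal to v."""
    g = [rng.gauss(0.0, 1.0) for _ in range(len(v))]
    a = sample_a(rng)
    shift = a - dot(g, v)  # swap the Gaussian component along v for a draw from A
    return [gi + shift * vi for gi, vi in zip(g, v)]


def rademacher(rng):
    """Toy choice of A (an assumption for illustration): uniform on {-1, +1}."""
    return rng.choice((-1.0, 1.0))


def moment(values, p):
    """Empirical p-th moment of a list of numbers."""
    return sum(x ** p for x in values) / len(values)


rng = random.Random(0)
n = 8  # hypothetical small dimension
v = unit_vector(n, rng)  # hidden direction

# A direction orthogonal to v, obtained with one Gram-Schmidt step.
w = unit_vector(n, rng)
c = dot(w, v)
u_raw = [wi - c * vi for wi, vi in zip(w, v)]
u_norm = math.sqrt(dot(u_raw, u_raw))
u = [x / u_norm for x in u_raw]

samples = [sample_planted(v, rademacher, rng) for _ in range(20000)]
along_v = [dot(x, v) for x in samples]
along_u = [dot(x, u) for x in samples]

print("4th moment along v:", moment(along_v, 4))  # equals 1 up to rounding
print("4th moment along u:", moment(along_u, 4))  # close to 3, the Gaussian value
```

In this sketch the low-order moments cannot separate the hidden direction from an orthogonal one; only the fourth moment does, and the sketch computes it using knowledge of $v$, which an algorithm for NGCA does not have.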

Paper Structure

This paper contains 80 sections, 104 theorems, 239 equations, 10 figures, 1 table.

Key Result

Theorem 1.2

Given $n$, suppose $2\leq k\leq (\log n)^{O(1)}$ and $A$ is a distribution on $\mathbb R$ such that: If $n$ is sufficiently large, then with high probability, given fewer than $n^{(1-\varepsilon)k/2}$ many samples, Sum-of-Squares of degree $o\left(\sqrt{\frac{\varepsilon\log n}{\log\log n}}\right)$ fails to distinguish between the random and planted distributions for the corresponding NGCA distinguishing problem.

Figures (10)

  • Figure 1: Examples of simple spiders where the left and right indices have size two. If $k\geq 3$, the simple spider with one circle vertex and one label-1 edge to each side has coefficient 0 in $M$, so it is not drawn.
  • Figure 2: Examples of simple spiders where the left and right indices have size three.
  • Figure 3: Examples of the decomposition into left, middle, and right parts. The sets $S_l$ and $S_r$ are shown with dotted ovals, and assuming $n \leq m \leq n^2$, a minimum weight vertex separator is shown in red.
  • Figure 4: This figure shows the approximate decomposition of $S(2,2;0)$ using minimum weight vertex separators (assuming that $m \leq n^2$). While natural, this decomposition does not work for our analysis.
  • Figure 5: An intersection configuration for the product $M_{\sigma}M_{\sigma^{\top}}$ where $\sigma$ is the left shape shown on the left. The orange arch between shape vertices $w_2$ and $w_3$ means that the two vertices coincide in the ribbon products being represented. The dotted set of vertices in $\sigma$ is the leftmost minimum square separator between $U_{\sigma}$ and the union of the set of intersected vertices and $V_{\sigma}$. Similarly the dotted set of vertices in $\sigma^{\top}$ is the rightmost minimum vertex separator between the union of $U_{\sigma^{\top}}$ and the set of the intersected vertices and $V_{\sigma^{\top}}$.
  • ...and 5 more figures

Theorems & Definitions (296)

  • Theorem 1.2: Main Theorem, Informal
  • Remark 1.3
  • Remark 1.4
  • Definition 2.1
  • Definition 2.2: Pseudo-expectation Operator for NGCA
  • Remark 2.3
  • Remark 2.4: Soft NGCA constraints
  • Definition 2.5
  • Definition 2.6: Moment matrix
  • Lemma 2.8: Pseudo-calibration
  • ...and 286 more