PTF Testing Lower Bounds for Non-Gaussian Component Analysis

Ilias Diakonikolas, Daniel M. Kane, Sihan Liu, Thanasis Pittas

TL;DR

This work establishes the first non-trivial lower bounds against the broad class of Polynomial Threshold Function (PTF) tests in statistical hypothesis testing, focusing on Non-Gaussian Component Analysis (NGCA). The authors connect PTF hardness to pseudorandom generators for PTFs and develop a toolkit centered on mollification, a hybrid argument, and directional-derivative control along random directions to bound the power of degree-$k$ tests. Their main result shows a near-optimal information-computation tradeoff: as long as both the sample size $n$ and the tester degree $k$ are small (specifically, $n < d^{(1/4-c^*)m}$ and $k < d^{c^*/C^*}$), a degree-$k$ PTF cannot reliably distinguish the NGCA null from the alternative. This yields hardness implications for NGCA and, by extension, for several robust statistics and mixture-model tasks, strengthening prior SQ/LDP bounds and revealing subtle gaps between LDP and general PTF capabilities. The techniques (mollification with strong anti-concentration, Taylor expansions along random directions, and a carefully tuned mollifier) offer new structural tools for analyzing low-degree polynomial thresholds in high dimensions with moment-matching alternatives.
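To get a feel for the sample-size threshold $d^{(1/4-c^*)m}$ quoted above, the following minimal sketch evaluates it for concrete parameters. The function name `sample_threshold` is hypothetical and not from the paper. The degree condition is not evaluated because the absolute constant $C^*$ is left unspecified.

```python
def sample_threshold(d: int, m: int, c_star: float) -> float:
    """Evaluate d^{(1/4 - c*) m}, the sample-size threshold in the main result.

    Illustrative helper only (hypothetical name). It checks the stated
    constraints c* in (0, 1/4) and m even, then computes the power.
    """
    if not (0.0 < c_star < 0.25):
        raise ValueError("c_star must lie strictly between 0 and 1/4")
    if m <= 0 or m % 2 != 0:
        raise ValueError("m must be a positive even integer")
    return float(d) ** ((0.25 - c_star) * m)


if __name__ == "__main__":
    # Example: d = 10^4, m = 4, c* = 1/8 gives exponent (1/4 - 1/8) * 4 = 1/2.
    print(sample_threshold(10_000, 4, 0.125))
```

With these example values the exponent is $1/2$, so the lower bound applies to testers using fewer than roughly $\sqrt{d}$ samples. Larger even $m$ pushes the threshold up polynomially in $d$.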

Abstract

This work studies information-computation gaps for statistical problems. A common approach for providing evidence of such gaps is to show sample complexity lower bounds (that are stronger than the information-theoretic optimum) against natural models of computation. A popular such model in the literature is the family of low-degree polynomial tests. While these tests are defined in a way that makes them easy to analyze, the class of algorithms that they rule out is somewhat restricted. An important goal in this context has been to obtain lower bounds against the stronger and more natural class of low-degree Polynomial Threshold Function (PTF) tests, i.e., any test that can be expressed as comparing some low-degree polynomial of the data to a threshold. Proving lower bounds against PTF tests has turned out to be challenging. Indeed, we are not aware of any non-trivial PTF testing lower bounds in the literature. In this paper, we establish the first non-trivial PTF testing lower bounds for a range of statistical tasks. Specifically, we prove a near-optimal PTF testing lower bound for Non-Gaussian Component Analysis (NGCA). Our NGCA lower bound implies similar lower bounds for a number of other statistical problems. Our proof leverages a connection to recent work on pseudorandom generators for PTFs and recent techniques developed in that context. At the technical level, we develop several tools of independent interest, including novel structural results for analyzing the behavior of low-degree polynomials restricted to random directions.
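As a concrete illustration of what "comparing some low-degree polynomial of the data to a threshold" means, here is a minimal sketch of a PTF test on an $n \times d$ sample matrix. The names `ptf_test` and `degree2_statistic` are hypothetical. The degree-2 polynomial is an arbitrary example chosen for illustration, not a test analyzed in the paper.

```python
import numpy as np


def degree2_statistic(samples: np.ndarray) -> float:
    """A hypothetical degree-2 polynomial of the data.

    Computes sum_i (||x_i||^2 - d) over the n rows x_i of `samples`,
    which has mean zero when every row is standard Gaussian in R^d.
    """
    n, d = samples.shape
    return float(np.sum(samples ** 2) - n * d)


def ptf_test(samples: np.ndarray, poly, threshold: float) -> int:
    """Polynomial threshold test: output 1 iff poly(samples) > threshold."""
    return int(poly(samples) > threshold)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    null_samples = rng.standard_normal((50, 20))  # n = 50 draws from N(0, I_20)
    print(ptf_test(null_samples, degree2_statistic, threshold=0.0))
```

The test's output depends on the data only through the sign of `poly(samples) - threshold`. A lower bound against degree-$k$ PTF tests rules out every choice of degree-$k$ polynomial and threshold at once.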

Paper Structure

This paper contains 38 sections, 19 theorems, 136 equations, 1 table.

Key Result

Theorem 1.5

There exists a sufficiently large absolute constant $C^*$ such that the following holds. For any $c^* \in (0, 1/4)$ and $d,k,n,m \in \mathbb Z_+$ such that (i) $m$ is even, (ii) $\max(k,m) < d^{c^*/C^*}$, and (iii) $n < d^{(1/4 - c^*) m}$, we have that if $p: \mathbb{R}^{n \times d} \mapsto \mathbb{R}$ [...] where $\mathcal{M}_{A,\mathbf{v}}$ denotes the hidden direction distribution from Definition 1.3, [...]

Theorems & Definitions (52)

  • Definition 1.1: $\gamma$-advantageous polynomial
  • Definition 1.2: $\beta$-good PTF test
  • Definition 1.3: High-Dimensional Hidden Direction Distribution
  • Theorem 1.5: Main Result
  • Lemma 3.1: Derivative Decay
  • Proof: Proof of Lemma 3.1 (Derivative Decay)
  • Claim 3.3
  • Proof: Proof of the bound on $W(\mathbf{v})$ (eq:W(v)-bound)
  • Lemma 3.4: Sandwiching
  • proof
  • ...and 42 more