Low coordinate degree algorithms I: Universality of computational thresholds for hypothesis testing

Dmitriy Kunisky

TL;DR

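The paper proves lower bounds against low coordinate degree functions for spiked matrix and tensor models observed through general noisy channels, including censored and quantized variants. These are the first computational lower bounds against any large class of algorithms for these models when the channel is not one of a few special cases, and thereby give the first substantial evidence for the universality of several statistical-to-computational gaps.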

Abstract

We study when low coordinate degree functions (LCDF) -- linear combinations of functions depending on small subsets of entries of a vector -- can hypothesis test between high-dimensional probability measures. These functions are a generalization, proposed in Hopkins' 2018 thesis but seldom studied since, of low degree polynomials (LDP), a class widely used in recent literature as a proxy for all efficient algorithms for tasks in statistics and optimization. Instead of the orthogonal polynomial decompositions used in LDP calculations, our analysis of LCDF is based on the Efron-Stein or ANOVA decomposition, making it much more broadly applicable. By way of illustration, we prove channel universality for the success of LCDF in testing for the presence of sufficiently "dilute" random signals through noisy channels: the efficacy of LCDF depends on the channel only through the scalar Fisher information for a class of channels including nearly arbitrary additive i.i.d. noise and nearly arbitrary exponential families. As applications, we extend lower bounds against LDP for spiked matrix and tensor models under additive Gaussian noise to lower bounds against LCDF under general noisy channels. We also give a simple and unified treatment of the effect of censoring models by erasing observations at random and of quantizing models by taking the sign of the observations. These results are the first computational lower bounds against any large class of algorithms for all of these models when the channel is not one of a few special cases, and thereby give the first substantial evidence for the universality of several statistical-to-computational gaps.
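
To make the notion of coordinate degree concrete, the following is a minimal sketch, not taken from the paper, that computes the Efron-Stein (ANOVA) decomposition of a function on a small finite product space and reads off its coordinate degree as the size of the largest subset with a nonzero component. The uniform product measure, the four-point grid, the example function, and the helper names `cond_mean`, `efron_stein`, and `coordinate_degree` are all assumptions chosen for illustration.

```python
import itertools
import numpy as np

def cond_mean(f, T):
    """E[f | y_T] under the uniform product measure (an assumption):
    average out every axis not in T, keeping dims for broadcasting."""
    axes = tuple(i for i in range(f.ndim) if i not in T)
    return f.mean(axis=axes, keepdims=True) if axes else f

def efron_stein(f):
    """Return {S: f_S} with f_S = sum over T subset of S of
    (-1)^{|S| - |T|} E[f | y_T], so that f is the sum of all f_S."""
    f = np.asarray(f, dtype=float)
    n = f.ndim
    comps = {}
    for k in range(n + 1):
        for S in itertools.combinations(range(n), k):
            comp = np.zeros_like(f)
            for j in range(k + 1):
                for T in itertools.combinations(S, j):
                    comp = comp + (-1) ** (k - j) * cond_mean(f, T)
            comps[S] = comp
    return comps

def coordinate_degree(f, tol=1e-9):
    """Largest |S| whose Efron-Stein component is numerically nonzero."""
    comps = efron_stein(f)
    sizes = (len(S) for S, c in comps.items() if np.abs(c).max() > tol)
    return max(sizes, default=0)

# Hypothetical example: three coordinates, each uniform on a 4-point grid.
grid = np.array([-1.0, 0.0, 1.0, 2.0])
y0, y1, y2 = np.meshgrid(grid, grid, grid, indexing="ij")

# Not a low degree polynomial in form, but a sum of functions of at most
# two coordinates each, so its coordinate degree is 2.
f = np.cos(y0 * y1) + np.abs(y2)
print(coordinate_degree(f))  # 2
```

The example function is built from a cosine and an absolute value rather than monomials, yet it is a sum of terms that each depend on at most two entries, which is the property that coordinate degree measures.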

Paper Structure

This paper contains 50 sections, 32 theorems, 118 equations.

Key Result

Proposition 1.5

For any prior $\mathcal{X}$ and $\sigma^2 > 0$, where $\bm x^1$ and $\bm x^2$ are independent draws from $\mathcal{X}$ and $\exp^{\leq D}$ is the truncated Taylor series of the exponential function,
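
As a numerical illustration of the ingredients named in this statement, the sketch below evaluates the expectation of $\exp^{\leq D}$ applied to the rescaled overlap $\langle \bm x^1, \bm x^2 \rangle / \sigma^2$ of two independent draws from a prior. That this particular expectation is the quantity of interest is an assumption based on how such results are usually stated for additive Gaussian noise, not something taken from the text above. The finitely supported uniform prior and the function names `exp_trunc` and `overlap_expectation` are likewise hypothetical.

```python
import math
import numpy as np

def exp_trunc(t, D):
    """Degree-D Taylor polynomial of exp at 0, evaluated elementwise:
    sum_{d=0}^{D} t^d / d!."""
    t = np.asarray(t, dtype=float)
    return sum(t ** d / math.factorial(d) for d in range(D + 1))

def overlap_expectation(support, sigma2, D):
    """Exact value of E exp^{<=D}(<x1, x2> / sigma2) when x1, x2 are
    independent and uniform over the rows of `support` (an assumed prior)."""
    support = np.asarray(support, dtype=float)
    overlaps = support @ support.T / sigma2  # all pairwise inner products
    return float(exp_trunc(overlaps, D).mean())

# Hypothetical one-dimensional prior: x is +1 or -1 with equal probability.
print(overlap_expectation([[1.0], [-1.0]], sigma2=1.0, D=2))  # 1.5
```

Here the overlap is $\pm 1$ with equal probability, so the degree-2 truncation gives $1 + 0 + \tfrac{1}{2} = 1.5$. Enumerating all pairs of support points avoids Monte Carlo error, at the cost of only working for small, finitely supported priors.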

Theorems & Definitions (91)

  • Definition 1.1: Continuous latent variable model
  • Definition 1.2: Strong detection
  • Remark 1.3: Signal-to-noise parameters
  • Conjecture 1.4: Informal low degree conjecture
  • Proposition 1.5: Additive Gaussian advantage; Theorem 2.6 of KWB-2022-LowDegreeNotes
  • Definition 1.6: Coordinate degree
  • Theorem 1.7: Informal
  • Definition 3.1: Good CLVM
  • Definition 3.2: Channel overlap
  • Example 3.3
  • ...and 81 more