Efficient reductions from a Gaussian source with applications to statistical-computational tradeoffs

Mengqi Lou, Guy Bresler, Ashwin Pananjady

TL;DR

This work develops a general framework for converting a single Gaussian observation into an observation from a target distribution via polynomial-time reductions, with total-variation (TV) deficiency bounds that hold uniformly over a parameter set. Central to the method is a signed-kernel construction combined with a rejection-sampling step, which together yield a Markov kernel mapping the Gaussian source to broad target classes, including non-Gaussian location models and Gaussian models whose mean is transformed by a nonlinear link. The reductions transfer hardness across models: tensor PCA hardness persists under non-Gaussian noise, a $k$-to-$k^2$ statistical-to-computational gap arises for sparse GLMs with even link functions, and $\mathsf{Rank1SubMat}$ and $\mathsf{PlantSubMat}$ are connected by a reduction that preserves hardness pointwise. Together, these results link statistical-to-computational barriers across canonical high-dimensional problems through a single Gaussian-source reduction technique. The reductions run in polynomial time and inflate the SNR in the target by only a polylogarithmic factor, extending the universality of computational barriers beyond Gaussian noise and to broader model classes.
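
As a rough illustration of the rejection-sampling primitive mentioned above, the sketch below draws light-tailed noise with density proportional to $\exp(-|x|^p)$ (a generalized normal with shape $p > 2$ and unit scale) from a standard Gaussian proposal. This is textbook rejection sampling, not the paper's signed-kernel construction: the paper's kernel acts on a Gaussian observation whose mean $\theta$ is unknown, which this toy example does not attempt. The function name `sample_light_tailed_noise` and the choice $p = 4$ are assumptions made purely for illustration.

```python
import math
import random


def sample_light_tailed_noise(p, rng):
    """Draw one sample with density proportional to exp(-|x|**p), for p > 2,
    by textbook rejection sampling from a standard Gaussian proposal.

    Hypothetical helper for illustration; not the paper's reduction.
    """
    if p <= 2:
        raise ValueError("this sketch assumes p > 2")

    # Log of the ratio exp(-|x|**p) / exp(-x**2 / 2); bounded above when p > 2.
    def log_ratio(x):
        return -abs(x) ** p + 0.5 * x * x

    # The ratio is maximised at |x| = p ** (-1 / (p - 2)).
    x_star = p ** (-1.0 / (p - 2.0))
    log_bound = log_ratio(x_star)

    while True:
        x = rng.gauss(0.0, 1.0)
        # Accept with probability ratio(x) / bound, which lies in (0, 1].
        if math.log(1.0 - rng.random()) <= log_ratio(x) - log_bound:
            return x


rng = random.Random(0)
noise = [sample_light_tailed_noise(4.0, rng) for _ in range(20_000)]
mean = sum(noise) / len(noise)
second_moment = sum(x * x for x in noise) / len(noise)
print(f"mean={mean:.3f}, second moment={second_moment:.3f}")
```

For $p = 4$ the second moment of the target is $\Gamma(3/4)/\Gamma(1/4) \approx 0.338$, so the empirical value printed above should land close to that number.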

Abstract

Given a single observation from a Gaussian distribution with unknown mean $\theta$, we design computationally efficient procedures that can approximately generate an observation from a different target distribution $Q_\theta$ uniformly for all $\theta$ in a parameter set. We leverage our technique to establish reduction-based computational lower bounds for several canonical high-dimensional statistical models under widely-believed conjectures in average-case complexity. In particular, we cover cases in which:

1. $Q_\theta$ is a general location model with non-Gaussian distribution, including both light-tailed examples (e.g., generalized normal distributions) and heavy-tailed ones (e.g., Student's $t$-distributions). As a consequence, we show that computational lower bounds proved for spiked tensor PCA with Gaussian noise are universal, in that they extend to other non-Gaussian noise distributions within our class.

2. $Q_\theta$ is a normal distribution with mean $f(\theta)$ for a general, smooth, and nonlinear link function $f:\mathbb{R} \rightarrow \mathbb{R}$. Using this reduction, we construct a reduction from symmetric mixtures of linear regressions to generalized linear models with link function $f$, and establish computational lower bounds for solving the $k$-sparse generalized linear model when $f$ is an even function. This result constitutes the first reduction-based confirmation of a $k$-to-$k^2$ statistical-to-computational gap in $k$-sparse phase retrieval, resolving a conjecture posed by Cai et al. (2016). As a second application, we construct a reduction from the sparse rank-1 submatrix model to the planted submatrix model, establishing a pointwise correspondence between the phase diagrams of the two models that faithfully preserves regions of computational hardness and tractability.

Paper Structure

This paper contains 97 sections, 25 theorems, 379 equations, 4 figures.

Key Result

Proposition 1

Consider the source model $\mathcal{U} = (\mathbb{R}, \{ \mathcal{N}(\theta, \sigma^{2}) \}_{\theta \in \Theta})$ and general target model $\mathcal{V} = (\mathbb{R},\{\mathcal{Q}_{\theta}\}_{\theta \in \Theta})$. Let $u(\cdot;\theta)$ be defined in Eq. (source-gaussian-density) and let $v(\cdot;\theta)$ […] Consider the signed kernel defined as […] Then we have $\varsigma( \mathcal{U}, \mathcal{V}; \cdots)$ […]

Figures (4)

  • Figure 1: Phase diagrams for the sparse mixture of linear regressions ($\mathsf{MixSLR}$; panel (a)) and sparse phase retrieval ($\mathsf{SPR}$; panel (b)), under the parameterization $k=\widetilde{\Theta}(n^{\beta})$ and $\eta^{4}=\widetilde{\Theta}(n^{-\alpha})$, with constants $\alpha,\beta\in(0,1)$. The purple region denotes the statistically impossible regime; the red region denotes the statistically possible but computationally intractable regime; the green region denotes the computationally tractable regime; and the black region denotes a statistically possible yet conjectured computationally hard regime, whose proof remains open. Our goal is to transfer all computationally hard regimes confirmed in panel (a) to panel (b).
  • Figure 2: Phase diagrams for the sparse rank-1 submatrix model ($\mathsf{Rank1SubMat}$; panel (a)) and the planted submatrix model ($\mathsf{PlantSubMat}$; panel (b)), under the parameterization $\mu/k = \widetilde{\Theta}(n^{-\alpha})$ and $k = \widetilde{\Theta}(n^{\beta})$ for $\alpha, \beta \in (0,1)$. The purple region indicates the statistically impossible regime. The green region corresponds to the computationally tractable regime where polynomial-time algorithms are known to succeed. The red region indicates the statistically possible but computationally intractable regime. In panel (a), the computational threshold is given by the line $\alpha = \beta - 1/2$, while in panel (b), the threshold is the line $\alpha = 2\beta - 1$.
  • Figure 3: Phase diagrams for the mixture of sparse linear regressions ($\mathsf{MixSLR}$; panel (a)) and sparse phase retrieval ($\mathsf{SPR}$; panel (b)), under the parameterization $k=\widetilde{\Theta}(n^{\beta})$ and $\eta^{4}=\widetilde{\Theta}(n^{-\alpha})$, with constants $\alpha,\beta\in(0,1)$. The purple region denotes the statistically impossible regime; the red region denotes the statistically possible but computationally intractable regime; the green region denotes the computationally tractable regime; and the black region denotes a statistically possible yet conjectured computationally hard regime, whose proof remains open. The reduction $\mathcal{R}$ maps $\mathsf{MixSLR}$ instances with parameters $(\alpha,\beta)$ to $\mathsf{SPR}$ instances with the same $(\alpha,\beta)$ for all $(\alpha, \beta)$ in the phase diagram, so any computational hardness regime established in panel (a) transfers directly to panel (b).
  • Figure 4: Phase diagrams for the sparse rank-1 submatrix model ($\mathsf{Rank1SubMat}$; panel (a)) and the planted submatrix model ($\mathsf{PlantSubMat}$; panel (b)), under the parameterization $\mu/k = \widetilde{\Theta}(n^{-\alpha})$ and $k = \widetilde{\Theta}(n^{\beta})$ for $\alpha, \beta \in (0,1)$. The purple region indicates the statistically impossible regime. The green region corresponds to the computationally tractable regime where polynomial-time algorithms are known to succeed. The red region indicates the statistically possible but computationally intractable regime. In panel (a), the computational threshold is given by the line $\alpha = \beta - 1/2$, while in panel (b), the threshold is the line $\alpha = 2\beta - 1$. The reduction $\mathcal{R}$ established in Theorem \ref{thm:reduction-ROS-BC} maps instances with parameters $(\alpha, \beta)$ in $\mathsf{Rank1SubMat}$ to instances with parameters $(2\alpha, \beta)$ in $\mathsf{PlantSubMat}$. This transformation demonstrates a pointwise correspondence between the phase diagrams, preserving the computational tractability and intractability of problem instances.
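
The pointwise correspondence described in the Figure 4 caption can be checked with a few lines of exact arithmetic. The sketch below takes only the two threshold lines and the map $(\alpha, \beta) \mapsto (2\alpha, \beta)$ from the caption; the helper names (`offset_rank1`, `offset_planted`, `reduction_map`) are hypothetical. It verifies that a point lies on the same side of the $\mathsf{Rank1SubMat}$ threshold as its image does of the $\mathsf{PlantSubMat}$ threshold, without assuming which side is the hard one. The check is purely algebraic and ignores the statistical boundary.

```python
from fractions import Fraction


def offset_rank1(alpha, beta):
    """Signed offset from the Rank1SubMat threshold line alpha = beta - 1/2."""
    return alpha - (beta - Fraction(1, 2))


def offset_planted(alpha, beta):
    """Signed offset from the PlantSubMat threshold line alpha = 2*beta - 1."""
    return alpha - (2 * beta - 1)


def reduction_map(alpha, beta):
    """Parameter map from the caption: (alpha, beta) -> (2*alpha, beta)."""
    return 2 * alpha, beta


def sign(x):
    return (x > 0) - (x < 0)


# Exact rational grid over (0, 1) x (0, 1).
grid = [Fraction(i, 20) for i in range(1, 20)]
side_preserved = all(
    sign(offset_rank1(a, b)) == sign(offset_planted(*reduction_map(a, b)))
    for a in grid
    for b in grid
)
print(side_preserved)  # True
```

The result follows from the identity $2\alpha - (2\beta - 1) = 2\,(\alpha - (\beta - 1/2))$: the offset from the threshold is simply doubled, so its sign never changes.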

Theorems & Definitions (37)

  • Remark 1
  • Proposition 1
  • Definition 1
  • Lemma 1
  • Definition 2
  • Theorem 1
  • Corollary 1
  • Example 1
  • Remark 2
  • Corollary 2: Generalized normal target
  • ...and 27 more