Which exceptional low-dimensional projections of a Gaussian point cloud can be found in polynomial time?

Andrea Montanari, Kangjie Zhou

TL;DR

This work characterizes the low-dimensional projections of a Gaussian point cloud that can be realized efficiently when the projection dimension $m$ is fixed and the sample size $n$ and ambient dimension $d$ grow proportionally, with $n/d \to \alpha$. It develops a two-stage Incremental AMP (IAMP) algorithm, based on approximate message passing, that yields an inner bound $\mathscr{F}^{\rm AMP}_{m,\alpha}$ on the feasible set, with a stochastic-integral representation for the resulting distributions. A duality framework connects the AMP objective to a generalized Parisi variational principle, yielding a Parisi PDE for the corresponding value function and establishing strong duality in regimes governed by a no-overlap-gap condition. The analysis combines two-stage AMP theory, stochastic optimal control, and PDE techniques to construct projections that are computable in polynomial time and to give rigorous support to replica-based predictions in a high-dimensional projection setting. The results also yield computationally attainable values for random optimization problems such as generalized spherical perceptron models, linking algorithmic feasibility to a Parisi-type landscape.

Abstract

Given $d$-dimensional standard Gaussian vectors $\boldsymbol{x}_1,\dots, \boldsymbol{x}_n$, we consider the set of all empirical distributions of their $m$-dimensional projections, for $m$ a fixed constant. Diaconis and Freedman (1984) proved that, if $n/d\to \infty$, all such distributions converge to the standard Gaussian distribution. In contrast, we study the proportional asymptotics, whereby $n,d\to \infty$ with $n/d\to \alpha\in (0, \infty)$. In this case, the projection of the data points along a typical random subspace is again Gaussian, but the set $\mathscr{F}_{m,\alpha}$ of all probability distributions that are asymptotically feasible as $m$-dimensional projections contains non-Gaussian distributions corresponding to exceptional subspaces. Non-rigorous methods from statistical physics yield an indirect characterization of $\mathscr{F}_{m,\alpha}$ in terms of a generalized Parisi formula. Motivated by the goal of putting this formula on a rigorous basis, and to understand whether these projections can be found efficiently, we study the subset $\mathscr{F}^{\rm alg}_{m,\alpha}\subseteq \mathscr{F}_{m,\alpha}$ of distributions that can be realized by a class of iterative algorithms. We prove that this set is characterized by a certain stochastic optimal control problem, and obtain a dual characterization of this problem in terms of a variational principle that extends Parisi's formula. As a byproduct, we obtain computationally achievable values for a class of random optimization problems including "generalized spherical perceptron" models.
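The contrast between typical and exceptional directions can be seen numerically in the simplest case $m=1$. The sketch below is only an illustration and is not the iterative algorithm studied in the paper: it compares the mean of the projected points along a direction drawn independently of the data with the mean along one hand-picked data-dependent direction, the normalized sum of the points. The function name `project_means`, the sizes `n=400`, `d=200`, and the choice of direction are all assumptions made for this example. Under these assumptions the second mean should be close to $\sqrt{d/n} = 1/\sqrt{\alpha}$, so the projected distribution is not standard Gaussian.

```python
import math
import random


def project_means(n=400, d=200, seed=0):
    """Return (mean along a random direction, mean along a data-dependent direction).

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    # Gaussian point cloud: n points in R^d with i.i.d. N(0, 1) coordinates.
    xs = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]

    def unit(v):
        # Normalize a vector to unit Euclidean norm.
        norm = math.sqrt(sum(c * c for c in v))
        return [c / norm for c in v]

    def mean_projection(w):
        # Mean of the empirical distribution of <x_i, w> over the n points.
        return sum(sum(a * b for a, b in zip(x, w)) for x in xs) / n

    # Typical direction: drawn independently of the data.
    w_random = unit([rng.gauss(0.0, 1.0) for _ in range(d)])
    # Data-dependent direction (an assumption of this sketch): normalized sum of all points.
    w_exceptional = unit([sum(x[j] for x in xs) for j in range(d)])
    return mean_projection(w_random), mean_projection(w_exceptional)


if __name__ == "__main__":
    n, d = 400, 200
    m_rand, m_exc = project_means(n=n, d=d)
    print(f"random direction mean:         {m_rand:+.3f}")
    print(f"data-dependent direction mean: {m_exc:+.3f}")
    print(f"sqrt(d/n) for comparison:      {math.sqrt(d / n):.3f}")
```

The shift $\sqrt{d/n}$ vanishes as $n/d \to \infty$, which is consistent with the Diaconis and Freedman regime quoted above, where every projection looks standard Gaussian.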

Paper Structure

This paper contains 28 sections, 43 theorems, and 405 equations.

Key Result

Theorem 1.1

Denote by $\mathscr{P} (\mathbb{R}^m)$ the set of all probability distributions on $\mathbb{R}^m$. Assume $E \subset \mathscr{P} (\mathbb{R}^m)$ is convex and closed under weak limits. Then, for any $\mu \in \mathscr{P} (\mathbb{R}^m)$, $\mu \in E$ if and only if for any $h \in C_b (\mathbb{R}^m)$,
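
The condition that completes this statement is not shown here. A minimal sketch, assuming the standard separation (Hahn-Banach) argument for convex, weakly closed sets of probability measures, is the support-function inequality below. It is a reconstruction under that assumption, not a quotation of the paper's display:

$$\int h \, \mathrm{d}\mu \;\le\; \sup_{\nu \in E} \int h \, \mathrm{d}\nu .$$

Under this reading, membership in $E$ is decided entirely by the values $\sup_{\nu \in E} \int h \, \mathrm{d}\nu$ over bounded continuous test functions $h$.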

Theorems & Definitions (84)

  • Theorem 1.1
  • Conjecture 2.1: Replica prediction for $\mathscr{V}_{m,\alpha}(h)$
  • Remark 1: Replica prediction for $\mathscr{V}_{1,\alpha}(h)$
  • Definition 1
  • Theorem 3.1: Inner bound for $\mathscr{F}^{\rm alg}_{m, \alpha}$
  • Corollary 3.1: Inner bound for $\mathscr{F}^{\rm alg}_{1, \alpha}$
  • Theorem 3.2: Optimal value of $H_{n, d} (\boldsymbol{W})$ achieved by AMP algorithms
  • Proposition 3.2
  • Definition 2: Space of functional order parameters
  • Theorem 3.3
  • ...and 74 more