Which exceptional low-dimensional projections of a Gaussian point cloud can be found in polynomial time?
Andrea Montanari, Kangjie Zhou
TL;DR
This work characterizes the set of low-dimensional projections of a Gaussian cloud that can be realized efficiently when the projection dimension $m$ is fixed and the number of points $n$ and the ambient dimension $d$ grow proportionally, with $n/d \to \alpha$. It develops a two-stage Incremental AMP (IAMP) algorithm that yields an inner bound $\mathscr{F}^{\rm AMP}_{m,\alpha}$ on the feasible set, with a stochastic-integral representation for the resulting distributions. A duality framework connects the AMP objective to a generalized Parisi variational principle, yielding a Parisi PDE for the corresponding value function and establishing strong duality in regimes governed by a no-overlap-gap condition. The analysis combines two-stage AMP theory, stochastic optimal control, and PDE techniques to identify projections that can be computed in polynomial time and to provide rigorous justification for replica-based predictions in a high-dimensional projection setting. The results also yield computationally attainable values for random optimization problems such as generalized spherical perceptron models, linking algorithmic feasibility to a Parisi-type landscape.
Abstract
Given $d$-dimensional standard Gaussian vectors $\boldsymbol{x}_1,\dots, \boldsymbol{x}_n$, we consider the set of all empirical distributions of their $m$-dimensional projections, for $m$ a fixed constant. Diaconis and Freedman (1984) proved that, if $n/d\to \infty$, all such distributions converge to the standard Gaussian distribution. In contrast, we study the proportional asymptotics, whereby $n,d\to \infty$ with $n/d\to \alpha\in (0, \infty)$. In this case, the projection of the data points along a typical random subspace is again Gaussian, but the set $\mathscr{F}_{m,\alpha}$ of all probability distributions that are asymptotically feasible as $m$-dimensional projections contains non-Gaussian distributions corresponding to exceptional subspaces. Non-rigorous methods from statistical physics yield an indirect characterization of $\mathscr{F}_{m,\alpha}$ in terms of a generalized Parisi formula. Motivated by the goal of putting this formula on a rigorous basis, and to understand whether these projections can be found efficiently, we study the subset $\mathscr{F}^{\rm alg}_{m,\alpha}\subseteq \mathscr{F}_{m,\alpha}$ of distributions that can be realized by a class of iterative algorithms. We prove that this set is characterized by a certain stochastic optimal control problem, and obtain a dual characterization of this problem in terms of a variational principle that extends Parisi's formula. As a byproduct, we obtain computationally achievable values for a class of random optimization problems including "generalized spherical perceptron" models.
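
The contrast between typical and exceptional directions can be illustrated numerically in the simplest case $m=1$. The sketch below is not the iterative algorithm studied in the paper. It is a toy example under stated assumptions: the function name `projection_means`, the sizes `n=600`, `d=300`, and the choice of the data-dependent direction (the normalized sum of the data points) are all hypothetical and chosen only for illustration. Along a direction drawn independently of the data, the empirical mean of the projections should be close to $0$. Along the normalized sum of the points, a short calculation suggests a mean close to $\sqrt{d/n} = 1/\sqrt{\alpha}$, which does not vanish in the proportional regime.

```python
import math
import random


def projection_means(n=600, d=300, seed=0):
    """Compare the empirical mean of 1-D projections of a Gaussian cloud
    along a data-independent direction and along a data-dependent one.

    All names and sizes here are illustrative assumptions, not from the paper.
    """
    rng = random.Random(seed)
    # n points in R^d with i.i.d. standard Gaussian coordinates.
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]

    def unit(v):
        # Rescale v to unit Euclidean norm.
        norm = math.sqrt(sum(t * t for t in v))
        return [t / norm for t in v]

    def mean_projection(u):
        # Empirical mean of the projections <x_i, u> over the n points.
        return sum(sum(a * b for a, b in zip(x, u)) for x in X) / n

    # Typical direction: uniform on the sphere, independent of the data.
    u_typical = unit([rng.gauss(0.0, 1.0) for _ in range(d)])
    # Data-dependent direction: normalized sum of all data points.
    u_exceptional = unit([sum(x[j] for x in X) for j in range(d)])

    return mean_projection(u_typical), mean_projection(u_exceptional)


typical_mean, exceptional_mean = projection_means()
alpha = 600 / 300
print("mean along typical direction:    ", typical_mean)
print("mean along exceptional direction:", exceptional_mean)
print("heuristic prediction 1/sqrt(alpha):", 1 / math.sqrt(alpha))
```

This only exhibits one easily computed feasible distribution with a shifted mean. The sets $\mathscr{F}_{m,\alpha}$ and $\mathscr{F}^{\rm alg}_{m,\alpha}$ discussed in the abstract concern the full empirical distribution of the projections, not just its first moment.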
