Langevin dynamics for high-dimensional optimization: the case of multi-spiked tensor PCA
Gérard Ben Arous, Cédric Gerbelot, Vanessa Piccolo
TL;DR
This work analyzes Langevin dynamics for high-dimensional nonconvex optimization in multi-spiked tensor PCA, reducing the complex landscape to low-dimensional autonomous dynamics governed by the correlations $m_{ij}$ and the Gram matrix $\boldsymbol{G}$. The authors establish sharp sample-complexity thresholds and SNR-separation conditions that distinguish $p\ge 3$ from $p=2$. For $p\ge 3$, the leading spike is recovered with $M\sim N^{p-2}$ samples and all spikes with $M\sim N^{p-1}$. For $p=2$, the leading spike is recoverable with $M=N^{\delta}$ for any $\delta>0$, full recovery depends on the SNR ratios, and with equal SNRs the guarantee is subspace recovery. A central methodological contribution is the bounding-flows approach, which, together with Itô calculus on the Stiefel manifold and Doob-type martingale bounds, yields precise control of the low-dimensional order parameters and reveals a sequential-elimination phenomenon in spike recovery. The results illuminate statistical-to-computational gaps and connect to gradient-flow and SGD analyses in companion papers, offering rigorous insight into the dynamics of nonconvex high-dimensional estimation on manifolds.
Abstract
We study nonconvex optimization in high dimensions through Langevin dynamics, focusing on the multi-spiked tensor PCA problem. This tensor estimation problem involves recovering $r$ hidden signal vectors (spikes) from noisy Gaussian tensor observations using maximum likelihood estimation. We study the number of samples required for Langevin dynamics to efficiently recover the spikes and determine the necessary separation condition on the signal-to-noise ratios (SNRs) for exact recovery, distinguishing the cases $p \ge 3$ and $p=2$, where $p$ denotes the order of the tensor. In particular, we show that the sample complexity required for recovering the spike associated with the largest SNR matches the well-known algorithmic threshold for the single-spike case, while this threshold degrades when recovering all $r$ spikes. As a key step, we provide a detailed characterization of the trajectory and interactions of low-dimensional projections that capture the high-dimensional dynamics.
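To make the objects in the abstract concrete, the following is a minimal numerical sketch of discretized Langevin dynamics on the Stiefel manifold for an order-3 multi-spiked tensor, tracking the correlations $m_{ij}=\langle v_i, x_j\rangle$. It is not the authors' construction. The single-sample setup, the signal scaling, the $\lambda_j$-weighted loss, the temperature convention, the Euler-Maruyama step with polar retraction, and all function names are assumptions made for illustration.

```python
import numpy as np

def make_spiked_tensor(N, lambdas, rng):
    """Gaussian noise plus sum_i lambda_i * v_i^{(x)3}. Scaling is illustrative only."""
    V, _ = np.linalg.qr(rng.standard_normal((N, len(lambdas))))  # orthonormal spikes
    Y = rng.standard_normal((N, N, N))
    for lam, v in zip(lambdas, V.T):
        Y += lam * np.einsum("a,b,c->abc", v, v, v)
    return Y, V

def euclidean_grad(Y, X, lambdas):
    """Gradient of the assumed loss H(X) = -sum_j lambda_j <Y, x_j^{(x)3}>.
    Y is not symmetrized, so all three index slots contribute."""
    g = (np.einsum("abc,bj,cj->aj", Y, X, X)
         + np.einsum("bac,bj,cj->aj", Y, X, X)
         + np.einsum("bca,bj,cj->aj", Y, X, X))
    return -g * np.asarray(lambdas)[None, :]

def tangent_project(X, Z):
    """Project Z onto the tangent space of the Stiefel manifold at X."""
    XtZ = X.T @ Z
    return Z - X @ (XtZ + XtZ.T) / 2

def polar_retract(A):
    """Map a full-rank N x r matrix to the nearest matrix with orthonormal columns."""
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ Vt

def langevin_step(Y, X, lambdas, beta, dt, rng):
    """One Euler-Maruyama step of dX = -beta * grad H dt + sqrt(2) dB (assumed
    convention), restricted to the tangent space and retracted to the manifold."""
    drift = -beta * tangent_project(X, euclidean_grad(Y, X, lambdas))
    noise = np.sqrt(2 * dt) * tangent_project(X, rng.standard_normal(X.shape))
    return polar_retract(X + dt * drift + noise)

def correlations(V, X):
    """m[i, j] = <v_i, x_j>, the low-dimensional order parameters."""
    return V.T @ X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lambdas = [6.0, 3.0]                      # hypothetical SNRs
    Y, V = make_spiked_tensor(20, lambdas, rng)
    X = polar_retract(rng.standard_normal((20, 2)))  # random orthonormal start
    for _ in range(200):
        X = langevin_step(Y, X, lambdas, beta=1.0, dt=1e-3, rng=rng)
    print(correlations(V, X))
```

The sketch is only meant to show which quantities the analysis tracks. At this small size and step count it should not be expected to reproduce the sample-complexity thresholds stated above.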
