Non-square matrix sensing without spurious local minima via the Burer-Monteiro approach
Dohyung Park, Anastasios Kyrillidis, Constantine Caramanis, Sujay Sanghavi
TL;DR
This work studies non-square matrix sensing under the Restricted Isometry Property (RIP) and analyzes the non-convex factorization X = UV^T. By stacking the factors as W = [U; V] and adding a balancing regularizer g(W) with weight λ = 1/4, the authors show that, under RIP, the non-convex parametrization does not introduce spurious local minima: any point W satisfying the first- and second-order optimality conditions must be close to a balanced factorization of the true X^*. A key result gives an explicit bound: if W is first- and second-order optimal, then (1 - 5δ_{2r} - 544δ_{4r}^2 - 1088δ_{2r}δ_{4r}^2)/(8(40+68δ_{2r})(1+δ_{2r})) · ||WW^T - W^*W^{*T}||_F^2 ≤ ||A(U^*V^{*T}) - b||^2. In the noiseless case the right-hand side is zero, so WW^T = W^*W^{*T} whenever the δ's are small enough for the leading constant to be positive, and X^* is recovered exactly. The paper also extends the discussion to noisy and high-rank settings, and establishes a strict saddle property, which ensures that gradient-based methods can escape non-global saddle points and approach the global optimum. Overall, the results support the Burer-Monteiro approach for non-square matrix sensing: the factored problem retains a favorable landscape under RIP.
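To make the stacked variable W = [U; V] and the role of the balancing term concrete, here is a minimal NumPy sketch. It rests on assumptions not stated in this summary: g is taken to be ||U^T U - V^T V||_F^2 (a common choice, and the paper's exact scaling may differ), the data-fit term is 0.5 · ||A(UV^T) - b||^2, and the sensing operator is built from hypothetical i.i.d. Gaussian matrices. The helper names are illustrative only.

```python
import numpy as np

def balanced_factors(X, r):
    """Return (U, V) with X ~= U V^T and U^T U = V^T V, from a rank-r SVD."""
    P, s, Qt = np.linalg.svd(X, full_matrices=False)
    root = np.sqrt(s[:r])
    return P[:, :r] * root, Qt[:r].T * root

def balance_penalty(U, V):
    """Assumed form of g: squared Frobenius norm of U^T U - V^T V."""
    return np.linalg.norm(U.T @ U - V.T @ V, "fro") ** 2

def sensing_objective(U, V, A, b, lam=0.25):
    """0.5 * ||A(U V^T) - b||^2 + lam * g, where A has shape (p, m, n)."""
    residual = np.einsum("pij,ij->p", A, U @ V.T) - b
    return 0.5 * residual @ residual + lam * balance_penalty(U, V)

rng = np.random.default_rng(0)
m, n, r, p = 6, 4, 2, 60
X_star = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
A = rng.standard_normal((p, m, n)) / np.sqrt(p)   # assumed Gaussian sensing matrices
b = np.einsum("pij,ij->p", A, X_star)             # noiseless measurements

U_star, V_star = balanced_factors(X_star, r)
W_star = np.vstack([U_star, V_star])              # stacked variable W = [U; V]

# The balanced factorization fits the data and has (numerically) zero penalty.
print(sensing_objective(U_star, V_star, A, b))
# Rescaling the factors leaves U V^T unchanged but is penalized by g.
print(sensing_objective(2.0 * U_star, 0.5 * V_star, A, b))
```

The second call shows why a balancing term is useful: (2U)(V/2)^T equals UV^T, so without g the objective cannot distinguish between infinitely many rescaled factorizations of the same X.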
Abstract
We consider the non-square matrix sensing problem, under restricted isometry property (RIP) assumptions. We focus on the non-convex formulation, where any rank-$r$ matrix $X \in \mathbb{R}^{m \times n}$ is represented as $UV^\top$, where $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{n \times r}$. In this paper, we complement recent findings on the non-convex geometry of the analogous PSD setting [5], and show that matrix factorization does not introduce any spurious local minima, under RIP.
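As an illustration of what the absence of spurious local minima suggests in practice, the sketch below runs plain gradient descent on the factored objective from a small random start. This is not the authors' algorithm. The Gaussian sensing matrices, the regularizer form ||U^T U - V^T V||_F^2 with weight 1/4, the step size 0.1, the initialization scale, and the iteration count are all assumptions chosen for a small toy instance.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r, p = 6, 4, 2, 200
X_star = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
X_star /= np.linalg.norm(X_star, 2)               # scale to unit spectral norm
A = rng.standard_normal((p, m, n)) / np.sqrt(p)   # assumed Gaussian sensing matrices
b = np.einsum("pij,ij->p", A, X_star)             # noiseless measurements

def loss_and_grads(U, V, lam=0.25):
    """Loss 0.5*||A(UV^T) - b||^2 + lam*||U^T U - V^T V||_F^2 and its gradients."""
    residual = np.einsum("pij,ij->p", A, U @ V.T) - b
    G = np.einsum("p,pij->ij", residual, A)       # adjoint of A applied to the residual
    D = U.T @ U - V.T @ V
    loss = 0.5 * residual @ residual + lam * np.sum(D ** 2)
    grad_U = G @ V + 4 * lam * U @ D
    grad_V = G.T @ U - 4 * lam * V @ D
    return loss, grad_U, grad_V

# Small random initialization (an assumption, not a prescribed scheme).
U = 0.1 * rng.standard_normal((m, r))
V = 0.1 * rng.standard_normal((n, r))
initial_loss, _, _ = loss_and_grads(U, V)

for _ in range(2000):
    _, grad_U, grad_V = loss_and_grads(U, V)
    U, V = U - 0.1 * grad_U, V - 0.1 * grad_V

final_loss, _, _ = loss_and_grads(U, V)
rel_error = np.linalg.norm(U @ V.T - X_star) / np.linalg.norm(X_star)
print(initial_loss, final_loss, rel_error)
```

A toy run like this does not verify the theory, which concerns the landscape under RIP rather than any particular step size or instance.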
