Improved Global Guarantees for the Nonconvex Burer--Monteiro Factorization via Rank Overparameterization
Richard Y. Zhang
TL;DR
This work analyzes the nonconvex Burer–Monteiro factorization for semidefinite-program-like objectives by studying $f(X)=φ(XX^{T})$ with $φ$ $L$-smooth and $μ$-strongly convex. It proves that a constant-factor overparameterization, specifically $r>\frac{1}{4}(L/μ-1)^{2}r^{\star}$, eliminates spurious local minima, enabling global convergence from arbitrary initializations and improving on the previous $r\ge n$ threshold. A corollary shows that with favorable conditioning ($L/μ<3$), no spurious local minima arise for any $r\ge r^{\star}$, including the exactly parameterized case $r=r^{\star}$, highlighting a sharp dependence on conditioning. The paper develops a two-stage SDP bounding framework and a valid inequality relating the invariants $α,β$ to characterize counterexamples, providing rigorous insight into how modest overparameterization reshapes the optimization landscape and informs algorithmic design for large-scale SDP-like problems.
Abstract
We consider minimizing a twice-differentiable, $L$-smooth, and $μ$-strongly convex objective $φ$ over an $n\times n$ positive semidefinite matrix $M\succeq0$, under the assumption that the minimizer $M^{\star}$ has low rank $r^{\star}\ll n$. Following the Burer--Monteiro approach, we instead minimize the nonconvex objective $f(X)=φ(XX^{T})$ over a factor matrix $X$ of size $n\times r$. This substantially reduces the number of variables from $O(n^{2})$ to as few as $O(n)$ and also enforces positive semidefiniteness for free, but at the cost of giving up the convexity of the original problem. In this paper, we prove that if the search rank $r\ge r^{\star}$ is overparameterized by a \emph{constant factor} with respect to the true rank $r^{\star}$, namely as in $r>\frac{1}{4}(L/μ-1)^{2}r^{\star}$, then despite nonconvexity, local optimization is guaranteed to globally converge from any initial point to the global optimum. This significantly improves upon a previous rank overparameterization threshold of $r\ge n$, which we show is sharp in the absence of smoothness and strong convexity, but would increase the number of variables back up to $O(n^{2})$. Conversely, without rank overparameterization, we prove that such a global guarantee is possible if and only if $φ$ is almost perfectly conditioned, with a condition number of $L/μ<3$. Therefore, we conclude that a small amount of overparameterization can lead to large improvements in theoretical guarantees for the nonconvex Burer--Monteiro factorization.
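To make the rank threshold concrete, the following is a minimal Python sketch that computes the smallest integer search rank satisfying $r>\frac{1}{4}(L/μ-1)^{2}r^{\star}$ together with $r\ge r^{\star}$, and compares the resulting number of factor variables $nr$ with the $n(n+1)/2$ free entries of a symmetric $n\times n$ matrix. The function names `min_search_rank` and `variable_counts` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def min_search_rank(L, mu, r_star):
    """Smallest integer r with r >= r_star and r > 0.25 * (L/mu - 1)**2 * r_star.

    Illustrative helper (hypothetical name): it only evaluates the threshold
    stated in the abstract and does not check any other assumption.
    """
    if mu <= 0 or L < mu:
        raise ValueError("need L >= mu > 0")
    if r_star < 1:
        raise ValueError("need r_star >= 1")
    kappa = L / mu  # condition number of phi
    bound = 0.25 * (kappa - 1.0) ** 2 * r_star
    # The inequality is strict, so take the smallest integer above the bound.
    r = math.floor(bound) + 1
    # The search rank must also be at least the true rank.
    return max(r, r_star)


def variable_counts(n, r):
    """Return (factor variables n*r, free entries of a symmetric n x n matrix)."""
    return n * r, n * (n + 1) // 2


if __name__ == "__main__":
    # Well conditioned (L/mu = 2 < 3): exact parameterization r = r_star suffices.
    print(min_search_rank(L=2.0, mu=1.0, r_star=5))
    # Boundary case L/mu = 3: the strict inequality requires r > r_star.
    print(min_search_rank(L=3.0, mu=1.0, r_star=2))
    # Worse conditioning (L/mu = 5): a constant-factor increase in rank.
    print(min_search_rank(L=5.0, mu=1.0, r_star=2))
    print(variable_counts(n=1000, r=9))
```

For example, with $L/μ=5$ and $r^{\star}=2$ the threshold gives $r=9$, so for $n=1000$ the factorization uses 9000 variables instead of the 500500 free entries of a symmetric matrix.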
