A biconvex optimization for solving semidefinite programs via bilinear factorization
En-Liang Hu
TL;DR
The paper addresses the scalability of semidefinite programming (SDP) by moving from the traditional quadratic factorization $Z=XX^\top$ to a bilinear factorization $Z=XY^\top$ augmented with a Courant penalty $\frac{\gamma}{2}\|X-Y\|_F^2$. A theoretical bound $\gamma>\tfrac{1}{4}(L-\sigma)_+$ ensures that stationary points of the bilinear surrogate correspond to stationary points of the original SDP under rank deficiency, linking to the Burer-Monteiro approach. The authors propose an alternating accelerated gradient descent (AAGD) algorithm to solve the resulting biconvex problem efficiently with closed-form stepsizes and low per-iteration cost. Empirical results on nonparametric kernel learning (NPKL) and colored maximum variance unfolding (CMVU) demonstrate competitive accuracy and improved convergence speed, highlighting the method's scalability to large SDP instances.
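For orientation, the formulations named above can be written side by side. This is a sketch in assumed notation: the objective symbol $f$, the dimension $n$, the rank $r$, and the generic form $\min_{Z \succeq 0} f(Z)$ are introduced here for illustration, while $Z$, $X$, $Y$, $\gamma$, $L$, and $\sigma$ come from the summary itself.

$$
\begin{aligned}
\text{SDP:} &\quad \min_{Z \succeq 0} \; f(Z), \\
\text{quadratic factorization:} &\quad \min_{X \in \mathbb{R}^{n \times r}} \; f(XX^\top), \\
\text{bilinear factorization with penalty:} &\quad \min_{X, Y \in \mathbb{R}^{n \times r}} \; f(XY^\top) + \frac{\gamma}{2}\|X-Y\|_F^2, \qquad \gamma > \tfrac{1}{4}(L-\sigma)_+ ,
\end{aligned}
$$

where $(t)_+ = \max(t, 0)$. Setting $X = Y$ makes the penalty vanish and recovers the quadratic form.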
Abstract
Many problems in machine learning reduce to learning a low-rank positive semidefinite matrix $Z$, which leads to a semidefinite program (SDP). Existing SDP solvers based on classical convex optimization are too expensive for large-scale problems. Exploiting the low rank of the solution, Burer-Monteiro's method reformulates the SDP as a nonconvex problem via the \emph{quadratic} factorization $Z = XX^\top$. However, this factorization loses the structure of the problem during optimization. In this paper, we propose to convert the SDP into a biconvex problem via the \emph{bilinear} factorization $Z = XY^\top$, while adding the term $\frac{\gamma}{2}\|X-Y\|_F^2$ to penalize the difference between $X$ and $Y$. The biconvex structure (w.r.t. $X$ and $Y$) can then be exploited naturally in optimization. As a theoretical result, we provide a bound on the penalty parameter $\gamma$ under the assumptions of $L$-Lipschitz smoothness and $\sigma$-strong biconvexity: when $\gamma > \frac{1}{4}(L-\sigma)_+$, the proposed bilinear factorization is equivalent to Burer-Monteiro's factorization at stationary points. Our proposal opens up a new way to surrogate an SDP by a biconvex program. Experiments on two SDP-related applications demonstrate that the proposed method is as effective as the state of the art.
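To make the penalized bilinear factorization concrete, here is a minimal NumPy sketch on a toy problem. Everything specific in it is an assumption made for illustration: the objective $f(Z) = \frac{1}{2}\|Z - M\|_F^2$ with a synthetic low-rank PSD target $M$, the helper names, and the value $\gamma = 1$, which is a convenient choice and is not derived from the paper's bound. The solver is plain alternating gradient descent with a $1/\text{Lipschitz}$ stepsize per block, not the accelerated method proposed in the paper.

```python
import numpy as np


def make_toy_target(n, r, seed=0):
    """Hypothetical test problem: a random rank-r PSD matrix M = A A^T."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, r)) / np.sqrt(n)
    return A @ A.T


def objective(X, Y, M, gamma):
    """F(X, Y) = 0.5*||X Y^T - M||_F^2 + (gamma/2)*||X - Y||_F^2."""
    R = X @ Y.T - M
    return 0.5 * np.sum(R * R) + 0.5 * gamma * np.sum((X - Y) ** 2)


def grad_first(X, Y, M, gamma):
    """Gradient of F with respect to its first argument."""
    return (X @ Y.T - M) @ Y + gamma * (X - Y)


def alternating_gd(M, r, gamma, iters=500, seed=0):
    """Alternate one gradient step in X (Y fixed) and one in Y (X fixed)."""
    rng = np.random.default_rng(seed)
    n = M.shape[0]
    X = rng.standard_normal((n, r))
    Y = rng.standard_normal((n, r))  # independent start, so X != Y initially
    history = [objective(X, Y, M, gamma)]
    for _ in range(iters):
        # For fixed Y, F is a convex quadratic in X whose gradient is
        # Lipschitz with constant ||Y||_2^2 + gamma (spectral norm).
        step = 1.0 / (np.linalg.norm(Y, 2) ** 2 + gamma)
        X = X - step * grad_first(X, Y, M, gamma)
        # M is symmetric, so F(X, Y) = F(Y, X) and the Y-gradient is
        # grad_first with the two arguments swapped.
        step = 1.0 / (np.linalg.norm(X, 2) ** 2 + gamma)
        Y = Y - step * grad_first(Y, X, M, gamma)
        history.append(objective(X, Y, M, gamma))
    return X, Y, history


if __name__ == "__main__":
    M = make_toy_target(n=20, r=3, seed=1)
    X, Y, history = alternating_gd(M, r=3, gamma=1.0)
    print("final objective:", history[-1])
    print("||X - Y||_F:", np.linalg.norm(X - Y))
    print("relative error of X Y^T:", np.linalg.norm(X @ Y.T - M) / np.linalg.norm(M))
```

Each subproblem is convex because the other factor is held fixed, which is what makes a closed-form stepsize available in this sketch. On this toy problem the penalty pulls $X$ and $Y$ together, so the product $XY^\top$ ends up symmetric like $XX^\top$.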
