Can Learning Be Explained By Local Optimality In Robust Low-rank Matrix Recovery?
Jianhao Ma, Salar Fattahi
TL;DR
This work analyzes robust low-rank matrix recovery under the nonsmooth $\ell_1$-loss and shows that, in many regimes, the true solutions corresponding to the ground-truth matrix $X^\star$ appear not as local optima but as strict saddles. Working with the Burer–Monteiro factorization and using parametric perturbation constructions, the authors establish precise sample-size thresholds that separate the regimes in which the true solutions are non-optimal, strict saddles, or global minima, for both symmetric and asymmetric matrix sensing and matrix completion. Key contributions include tight landscape characterizations under Gaussian sensing and elementwise completion, together with proofs of matching lower bounds, which demonstrate the nontrivial role of rank, coherence, and noise in shaping the optimization geometry. The findings challenge the belief that saddle points are universally detrimental and carry nuanced implications for learning dynamics. These include potential explanations for why simple subgradient methods can converge to true solutions in the presence of outliers, as well as directions for future work on saddle-escape and early-stopping strategies.
Abstract
We explore the local landscape of low-rank matrix recovery, focusing on reconstructing a $d_1\times d_2$ matrix $X^\star$ with rank $r$ from $m$ linear measurements, some of which may be noisy. When the noise is distributed according to an outlier model, minimizing a nonsmooth $\ell_1$-loss with a simple subgradient method can often perfectly recover the ground truth matrix $X^\star$. Given this, a natural question is what optimization property (if any) enables such learning behavior. The most plausible answer is that the ground truth $X^\star$ manifests as a local optimum of the loss function. In this paper, we show that this answer fails in a strong sense: under moderate assumptions, the true solutions corresponding to $X^\star$ do not emerge as local optima, but rather as strict saddle points, that is, critical points with strictly negative curvature in at least one direction. Our findings challenge the conventional belief that all strict saddle points are undesirable and should be avoided.
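To make the setting concrete, the following is a minimal sketch of a subgradient method applied to an $\ell_1$-loss over a factorized variable. It is not the authors' implementation. Everything specific in it is an illustrative assumption rather than something stated above: the restriction to the symmetric case $X^\star = W^\star W^{\star\top}$, the loss written as $\frac{1}{m}\sum_{k}|\langle A_k, WW^\top\rangle - y_k|$, the problem sizes, the outlier magnitude, the small random initialization, and the geometrically decaying step size.

```python
import numpy as np


def make_problem(d=10, r=2, m=2000, outlier_frac=0.1, seed=0):
    """Synthetic symmetric sensing instance (all sizes are illustrative)."""
    rng = np.random.default_rng(seed)
    W_star = rng.standard_normal((d, r)) / np.sqrt(d)
    X_star = W_star @ W_star.T                  # rank-r ground truth
    A = rng.standard_normal((m, d, d))          # Gaussian sensing matrices
    y = np.einsum("kij,ij->k", A, X_star)       # clean measurements <A_k, X*>
    # Assumed outlier model: corrupt a random subset with large noise.
    idx = rng.choice(m, size=int(outlier_frac * m), replace=False)
    y[idx] += 5.0 * rng.standard_normal(idx.size)
    return A, y, X_star


def l1_loss(W, A, y):
    """(1/m) * sum_k |<A_k, W W^T> - y_k|."""
    residual = np.einsum("kij,ij->k", A, W @ W.T) - y
    return np.mean(np.abs(residual))


def l1_subgradient(W, A, y):
    """One element of the subdifferential of l1_loss at W."""
    residual = np.einsum("kij,ij->k", A, W @ W.T) - y
    # np.sign(0) == 0, which is a valid choice inside [-1, 1].
    S = np.einsum("k,kij->ij", np.sign(residual), A) / len(y)
    return (S + S.T) @ W


def subgradient_method(A, y, r, steps=500, eta0=0.1, decay=0.98,
                       init_scale=1e-2, seed=1):
    """Subgradient steps with an assumed geometric step-size schedule."""
    rng = np.random.default_rng(seed)
    d = A.shape[1]
    W = init_scale * rng.standard_normal((d, r))  # small random start
    for t in range(steps):
        W = W - eta0 * decay**t * l1_subgradient(W, A, y)
    return W


def relative_error(W, X_star):
    """Frobenius-norm error of W W^T relative to the ground truth."""
    return np.linalg.norm(W @ W.T - X_star) / np.linalg.norm(X_star)


if __name__ == "__main__":
    A, y, X_star = make_problem()
    W = subgradient_method(A, y, r=2)
    print("final loss:", l1_loss(W, A, y))
    print("relative recovery error:", relative_error(W, X_star))
```

Running the script prints the final loss and the relative error of $WW^\top$ with respect to $X^\star$. Because a fraction of the measurements is corrupted, the loss at a well-recovered solution stays bounded away from zero, so the recovery error is the more informative quantity to inspect.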
