Low-Rank Mirror-Prox for Nonsmooth and Low-Rank Matrix Optimization Problems

Dan Garber; Atara Kaplan

TL;DR

This work addresses convex nonsmooth matrix optimization over the spectrahedron. It shows that, under a strict complementarity (SC) assumption and when the nonsmooth objective can be written as a maximum of smooth functions, a saddle-point reformulation enables efficient low-rank methods. The authors develop approximated mirror-prox algorithms, one with Euclidean updates (extragradient) and one with matrix exponentiated gradient (MEG) updates, and prove an $O(1/t)$ convergence rate while requiring only two rank-$r$ SVDs per iteration, provided the method is warm-started close to an optimal low-rank saddle point. They also derive efficiently computable certificates that verify the low-rank projections, and establish conditions under which these approximations coincide with exact updates. Experiments on sparse PCA, robust PCA, phase synchronization, and low-rank matrix recovery support both the plausibility of SC and its relaxed, generalized variant (GSC) and the efficient convergence of the proposed low-rank mirror-prox algorithms.
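To make the idea of a certified low-rank projection concrete, here is a minimal sketch in Python. It is not taken from the paper. It relies on the standard fact that the Euclidean projection onto the spectrahedron projects the eigenvalues onto the probability simplex, so the projection has rank at most $r$ whenever $\sum_{i\le r}(\lambda_i-\lambda_{r+1})\ge 1$. The function names are hypothetical, and the full `numpy.linalg.eigh` call is only a stand-in for a rank-$(r+1)$ eigensolver; the paper's own certificates may be stated differently.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]                      # eigenvalues in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    theta = css[rho] / (rho + 1)              # shift that makes the sum equal 1
    return np.maximum(v - theta, 0.0)

def project_spectrahedron(Y):
    """Exact projection via a full eigendecomposition (reference only)."""
    w, V = np.linalg.eigh(Y)
    return (V * project_simplex(w)) @ V.T

def low_rank_projection(Y, r):
    """Projection built from the top-r eigenpairs, plus a rank certificate.

    Assumption: a full eigh is used here for simplicity; in practice only
    the top r+1 eigenpairs would be computed.
    """
    w, V = np.linalg.eigh(Y)
    w, V = w[::-1], V[:, ::-1]                # sort descending
    top_w, top_V = w[:r], V[:, :r]
    next_w = w[r] if r < len(w) else -np.inf
    # If the gap condition holds, the exact projection has rank <= r,
    # so the rank-r computation below coincides with it.
    certified = bool(np.sum(top_w - next_w) >= 1.0)
    X = (top_V * project_simplex(top_w)) @ top_V.T
    return X, certified

# Example: one dominant eigenvalue, so a rank-1 projection is certified.
Y = np.diag([1.5, 0.2, 0.1, 0.0])
X_lr, ok = low_rank_projection(Y, 1)
```

When the certificate is `False`, the rank-$r$ result is only an approximation of the true projection and may differ from it.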

Abstract

Low-rank and nonsmooth matrix optimization problems capture many fundamental tasks in statistics and machine learning. While significant progress has been made in recent years in developing efficient methods for \textit{smooth} low-rank optimization problems that avoid maintaining high-rank matrices and computing expensive high-rank SVDs, advances for nonsmooth problems have been slow paced. In this paper we consider standard convex relaxations for such problems. Mainly, we prove that under a \textit{strict complementarity} condition and under the relatively mild assumption that the nonsmooth objective can be written as a maximum of smooth functions, approximated variants of two popular \textit{mirror-prox} methods: the Euclidean \textit{extragradient method} and mirror-prox with \textit{matrix exponentiated gradient updates}, when initialized with a "warm-start", converge to an optimal solution with rate $O(1/t)$, while requiring only two \textit{low-rank} SVDs per iteration. Moreover, for the extragradient method we also consider relaxed versions of strict complementarity which yield a trade-off between the rank of the SVDs required and the radius of the ball in which we need to initialize the method. We support our theoretical results with empirical experiments on several nonsmooth low-rank matrix recovery tasks, demonstrating both the plausibility of the strict complementarity assumption, and the efficient convergence of our proposed low-rank mirror-prox variants.
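As an illustration of the saddle-point view and the extragradient update, the sketch below uses a toy objective chosen for this example, not one from the paper: $g(\mathbf{X})=\|\mathbf{X}-\mathbf{M}\|_1=\max_{\|\mathbf{Y}\|_\infty\le 1}\langle \mathbf{Y},\mathbf{X}-\mathbf{M}\rangle$, a maximum of linear (hence smooth) functions. The function names, the step size `eta=0.5`, and the optional `rank` truncation are assumptions for illustration. The code uses a full eigendecomposition rather than a low-rank SVD, and it returns the averaged extrapolated iterate.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def project_spectrahedron(Y, rank=None):
    """Project onto {X PSD, trace(X) = 1}; optionally keep only top-`rank` eigenpairs."""
    w, V = np.linalg.eigh((Y + Y.T) / 2)
    if rank is not None:
        w, V = w[-rank:], V[:, -rank:]        # eigh returns ascending order
    return (V * project_simplex(w)) @ V.T

def extragradient_l1(M, steps=500, eta=0.5, rank=None):
    """Extragradient for min_X max_Y <Y, X - M>, X in spectrahedron, |Y_ij| <= 1."""
    n = M.shape[0]
    X = np.eye(n) / n
    Y = np.zeros((n, n))
    X_avg = np.zeros((n, n))
    for _ in range(steps):
        # Extrapolation step: gradients evaluated at the current point (X, Y).
        Zx = project_spectrahedron(X - eta * Y, rank)
        Zy = np.clip(Y + eta * (X - M), -1.0, 1.0)
        # Update step: gradients evaluated at the extrapolated point (Zx, Zy).
        X = project_spectrahedron(X - eta * Zy, rank)
        Y = np.clip(Y + eta * (Zx - M), -1.0, 1.0)
        X_avg += Zx / steps
    return X_avg

# Example: M is a rank-one point of the spectrahedron, so the optimal value is 0.
M = np.zeros((5, 5))
M[0, 0] = 1.0
X_avg = extragradient_l1(M)
objective = np.abs(X_avg - M).sum()
```

Each iteration performs exactly two projections onto the spectrahedron, which is where the two SVDs per iteration come from. Passing `rank=r` replaces them with truncated projections, and that is only justified under the warm-start and strict complementarity conditions described above.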

Paper Structure (39 sections, 21 theorems, 186 equations, 3 figures, 5 tables, 5 algorithms)

Introduction
Additional related work
Organization of this paper
Strict Complementarity for Nonsmooth Optimization and Difficulty of Using Low-Rank Projected Subgradient Steps
The challenge of applying low-rank projected subgradient steps
From Nonsmooth to Saddle-Point Problems
Approximated Mirror-Prox for Saddle-Point Problems
Bregman distances and mirror-prox methods
Bregman distances for the spectrahedron
Euclidean distance:
Bregman distance corresponding to the von Neumann entropy:
Approximated mirror-prox method
Projected Extragradient Method with Low-Rank Projections
Efficiently-computable certificates for correctness of low-rank Euclidean projections
Mirror-Prox with Low-Rank Matrix Exponentiated Gradient Updates
...and 24 more sections

Key Result

Lemma 1

Let $g:\mathbb{S}^n\rightarrow\mathbb{R}$ be a convex function. Then ${\mathbf{X}}^*\in{\mathcal{S}_n}$ minimizes $g$ over ${\mathcal{S}_n}$ if and only if there exists a subgradient ${\mathbf{G}}^*\in\partial g({\mathbf{X}}^*)$ such that $\langle {\mathbf{X}}-{\mathbf{X}}^*,{\mathbf{G}}^*\rangle\ge 0$ for all ${\mathbf{X}}\in{\mathcal{S}_n}$.
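The first-order condition of Lemma 1, in its standard form $\langle \mathbf{X}-\mathbf{X}^*,\mathbf{G}^*\rangle\ge 0$ for every $\mathbf{X}$ in the spectrahedron, can be checked numerically on a simple case. The sketch below uses a linear objective $g(\mathbf{X})=\langle \mathbf{C},\mathbf{X}\rangle$, chosen for illustration rather than taken from the paper. Its gradient is $\mathbf{C}$ everywhere, and a minimizer is $\mathbf{v}\mathbf{v}^\top$ for a unit eigenvector $\mathbf{v}$ of the smallest eigenvalue of $\mathbf{C}$. All function names are hypothetical.

```python
import numpy as np

def random_spectrahedron_point(n, rng):
    """Random PSD matrix normalized to unit trace."""
    A = rng.standard_normal((n, n))
    P = A @ A.T
    return P / np.trace(P)

def optimality_gaps(X_star, G_star, trials=200, seed=0):
    """Values of <X - X_star, G_star> at randomly sampled feasible X."""
    rng = np.random.default_rng(seed)
    n = X_star.shape[0]
    return np.array([
        np.sum((random_spectrahedron_point(n, rng) - X_star) * G_star)
        for _ in range(trials)
    ])

# Linear objective g(X) = <C, X> with a random symmetric C.
rng = np.random.default_rng(1)
B = rng.standard_normal((6, 6))
C = (B + B.T) / 2
w, V = np.linalg.eigh(C)
X_star = np.outer(V[:, 0], V[:, 0])   # eigenvector of the smallest eigenvalue
gaps = optimality_gaps(X_star, C)     # all entries should be nonnegative
```

Here every sampled gap equals $\langle \mathbf{C},\mathbf{X}\rangle-\lambda_{\min}(\mathbf{C})$, which is nonnegative on the spectrahedron. Sampling only illustrates the condition and does not prove it.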

Figures (3)

Figure 3: Comparison between projected subgradient descent (SD), low-rank projected subgradient descent (low-rank SD), and low-rank projected extragradient (low-rank EG).
Figure : $\lambda=0.001$, $\textnormal{rank}({\mathbf{M}})=1$, $\textnormal{rank}({\mathbf{A}}_i)=1$, $c=5000$
Figure : $\eta=0.1$, $\lambda=0.001$, $\textnormal{rank}({\mathbf{M}})=1$, $\textnormal{rank}({\mathbf{A}}_i)=1$, $c=5000$

Theorems & Definitions (39)

Lemma 1: first-order optimality condition, see beckOptimizationBook
Definition 1: strict complementarity
Lemma 2
Lemma 3
Lemma 4: failure of subgradient descent with low-rank projections on sparse PCA
Lemma 5
Remark 1
Definition 2: Bregman distance
Lemma 6
Theorem 1: main theorem
...and 29 more
