
Computing Bouligand stationary points efficiently in low-rank optimization

Guillaume Olikier, P. -A. Absil

TL;DR

This paper proposes a first-order algorithm that generates a sequence in the variety whose accumulation points are Bouligand stationary while requiring SVDs of matrices whose smaller dimension is always at most $r$.

Abstract

This paper considers the problem of minimizing a differentiable function with locally Lipschitz continuous gradient on the algebraic variety of all $m$-by-$n$ real matrices of rank at most $r$. Several definitions of stationarity exist for this nonconvex problem. Among them, Bouligand stationarity is the strongest necessary condition for local optimality. Only a handful of algorithms generate a sequence in the variety whose accumulation points are provably Bouligand stationary. Among them, the most parsimonious with (truncated) singular value decompositions (SVDs) or eigenvalue decompositions can still require a truncated SVD of a matrix whose rank can be as large as $\min\{m, n\}-r+1$ if the gradient does not have low rank, which is computationally prohibitive in the typical case where $r \ll \min\{m, n\}$. This paper proposes a first-order algorithm that generates a sequence in the variety whose accumulation points are Bouligand stationary while requiring SVDs of matrices whose smaller dimension is always at most $r$. A standard measure of Bouligand stationarity converges to zero along the bounded subsequences at a rate at least $O(1/\sqrt{i+1})$, where $i$ is the iteration counter. Furthermore, a rank-increasing scheme based on the proposed algorithm is presented, which can be of interest if the parameter $r$ is potentially overestimated.
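To make the cost argument concrete, here is a minimal sketch of why SVDs of matrices whose smaller dimension is at most $r$ are cheap. It assumes a matrix is stored in factored form $X = L R^\top$ with $L \in \mathbb{R}^{m \times r}$ and $R \in \mathbb{R}^{n \times r}$, and computes its thin SVD from two QR factorizations and one $r$-by-$r$ SVD. This is a standard linear-algebra technique, not necessarily the procedure used in the paper; the function name `svd_of_factored` and the test sizes are illustrative assumptions.

```python
import numpy as np

def svd_of_factored(L, R):
    """Thin SVD of X = L @ R.T without forming X.

    L is m-by-r and R is n-by-r (hypothetical factored storage).
    """
    QL, SL = np.linalg.qr(L)  # L = QL @ SL, with QL m-by-r and SL r-by-r
    QR, SR = np.linalg.qr(R)  # R = QR @ SR, with QR n-by-r and SR r-by-r
    # The only SVD needed is that of an r-by-r matrix.
    U_small, s, Vt_small = np.linalg.svd(SL @ SR.T)
    return QL @ U_small, s, QR @ Vt_small.T

rng = np.random.default_rng(0)
m, n, r = 200, 150, 3
L = rng.standard_normal((m, r))
R = rng.standard_normal((n, r))

U, s, V = svd_of_factored(L, R)

# Compare against the dense matrix (formed here only for checking).
X = L @ R.T
recon_error = np.linalg.norm(X - (U * s) @ V.T)
sv_error = np.linalg.norm(s - np.linalg.svd(X, compute_uv=False)[:r])
```

The dense matrix `X` is built only to check the result. In an actual low-rank method the point is to avoid forming it, so the cost is governed by $r$ rather than by $\min\{m, n\}$.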

Paper Structure

This paper contains 15 sections, 10 theorems, 59 equations, 2 tables, 6 algorithms.

Key Result

Proposition 2.1

Let $\mathcal{C} \subseteq \mathbb{R}^{m \times n}$ be a closed cone. For all $X \in \mathbb{R}^{m \times n}$ and $Y \in P_{\mathcal{C}}(X)$, $\langle X, Y \rangle = \|Y\|^2$, thus $\|X - Y\|^2 = \|X\|^2 - \|Y\|^2$, and $P_{\mathcal{C}}(X) = \{0_{m \times n}\}$ if and only if $X \in \mathcal{C}^*$.
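As a numerical sanity check, the sketch below tests two standard consequences of projecting onto a closed cone, $\langle X, Y \rangle = \|Y\|^2$ and $\|X - Y\|^2 = \|X\|^2 - \|Y\|^2$ for $Y \in P_{\mathcal{C}}(X)$, in the Frobenius inner product and norm. It takes $\mathcal{C}$ to be the set of matrices of rank at most $r$, which is a closed cone, and assumes that a truncated SVD yields a projection onto it (the Eckart-Young theorem). The helper name `project_rank_at_most` and the problem sizes are illustrative and not taken from the paper.

```python
import numpy as np

def project_rank_at_most(X, r):
    """Return one projection of X onto {rank <= r} via truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

rng = np.random.default_rng(1)
m, n, r = 8, 6, 2
X = rng.standard_normal((m, n))
Y = project_rank_at_most(X, r)

inner = np.sum(X * Y)                    # Frobenius inner product <X, Y>
norm_Y_sq = np.linalg.norm(Y) ** 2       # ||Y||^2
dist_sq = np.linalg.norm(X - Y) ** 2     # ||X - Y||^2
norm_X_sq = np.linalg.norm(X) ** 2       # ||X||^2

# Both gaps should vanish up to floating-point rounding.
gap_inner = abs(inner - norm_Y_sq)
gap_pythagoras = abs(dist_sq - (norm_X_sq - norm_Y_sq))
```

Both gaps vanish up to rounding because $tY$ stays in the cone for every $t \ge 0$, so $t = 1$ must minimize $\|X - tY\|^2$, which forces $\langle X - Y, Y \rangle = 0$.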

Theorems & Definitions (20)

  • Proposition 2.1: LevinKileelBoumal2023
  • Proposition 2.2: GolubVanLoan
  • Definition 2.3
  • Definition 2.4
  • Proposition 3.1
  • Proof 1
  • Corollary 3.2
  • Proof 2
  • Lemma 4.1
  • Proof 3
  • ...and 10 more