Optimization over bounded-rank matrices through a desingularization enables joint global and local guarantees

Quentin Rebjock, Nicolas Boumal

TL;DR

This work tackles optimization over matrices of bounded rank using a desingularization, following Khrulkov and Oseledets [2018]: the non-smooth feasible set $\mathbb{R}_{\leq r}^{m \times n}$ is replaced by a smooth manifold $\mathcal{M}$ whose image under a parameterization $\varphi$ is exactly that set, and the cost $f$ is replaced by the lifted cost $g = f \circ \varphi$. It develops a Riemannian framework on $\mathcal{M}$, including a family of $\alpha$-weighted metrics, parsimonious tangent representations, multiple retractions (including second-order ones), and explicit gradient/Hessian expressions to enable general-purpose optimization. The authors prove global convergence to critical points for descent-type methods and establish fast local convergence through Polyak–Łojasiewicz (PL) or Morse–Bott-type conditions, even when the limit has non-maximal rank. Numerical experiments on matrix completion show performance comparable to other methods, including under rank overestimation, and the work provides open-source implementations.

Abstract

Convergence guarantees for optimization over bounded-rank matrices are delicate to obtain because the feasible set is a non-smooth and non-convex algebraic variety. Existing techniques include projected gradient descent, fixed-rank optimization (over the maximal-rank stratum), and the $LR^\top$ parameterization. They all lack either global guarantees (the ability to accumulate only at critical points) or fast local convergence (e.g., if the limit has non-maximal rank). We seek optimization algorithms that enjoy both. Khrulkov and Oseledets [2018] parameterize the bounded-rank variety via a desingularization to recast the optimization problem onto a smooth manifold. Building on their ideas, we develop a Riemannian geometry for this desingularization, also with care for numerical considerations. We use it to secure global convergence to critical points with fast local rates, for a large range of algorithms. On matrix completion tasks, we find that this approach is comparable to others, while enjoying better general-purpose theoretical guarantees.

Paper Structure (35 sections, 31 theorems, 88 equations, 4 figures, 1 table, 1 algorithm)

Key Result

Lemma 2.1

Given $(X, P) \in \mathcal{M}$, there exist $U \in \mathrm{St}(m, r)$, $V \in \mathrm{St}(n, r)$ and a diagonal matrix $\Sigma \in {\mathbb R}^{r \times r}$ with non-negative entries such that $X = U \Sigma V^\top$ and $P = I - V V^\top$.
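Lemma 2.1 can be checked numerically. The sketch below goes in the converse direction: starting from a matrix $X$ of rank at most $r$, it builds factors $U$, $\Sigma$, $V$ from a truncated SVD, sets $P = I - V V^\top$, and verifies the relations in the lemma. It assumes (this is not stated in the excerpt above) that membership in $\mathcal{M}$ means $P$ is an orthogonal projector of rank $n - r$ with $XP = 0$. The function name `lemma_factors` is hypothetical and is not taken from the authors' code.

```python
import numpy as np

def lemma_factors(X, r):
    """Return (U, sigma, V, P) with X = U diag(sigma) V^T and P = I - V V^T.

    Assumes rank(X) <= r <= min(X.shape). The truncated SVD supplies
    orthonormal columns even for zero singular values.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U, sigma, V = U[:, :r], s[:r], Vt[:r, :].T
    P = np.eye(X.shape[1]) - V @ V.T
    return U, sigma, V, P

rng = np.random.default_rng(0)
m, n, r, true_rank = 6, 5, 3, 2  # true_rank < r: a non-maximal-rank point

# Random matrix of rank true_rank, hence of rank at most r.
X = rng.standard_normal((m, true_rank)) @ rng.standard_normal((true_rank, n))
U, sigma, V, P = lemma_factors(X, r)

# Relations from Lemma 2.1.
factorization_ok = np.allclose((U * sigma) @ V.T, X)
orthonormal_ok = np.allclose(U.T @ U, np.eye(r)) and np.allclose(V.T @ V, np.eye(r))
nonnegative_ok = bool(np.all(sigma >= 0))

# Assumed defining properties of M: P is a rank-(n - r) projector and X P = 0.
projector_ok = np.allclose(P, P.T) and np.allclose(P @ P, P)
rank_ok = np.isclose(np.trace(P), n - r)
kernel_ok = np.allclose(X @ P, 0)
```

Because `true_rank < r`, the last entry of `sigma` is zero up to rounding, yet `V` still has $r$ orthonormal columns, so $P$ keeps rank $n - r$. This is the situation the lemma covers with its "non-negative entries" wording.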

Figures (4)

  • Figure 1: The desingularization manifold $\mathcal{M}$ is embedded in $\mathcal{E} = {\mathbb R}^{m \times n} \times \mathrm{Sym}(n)$. Its image through the parameterization is $\varphi(\mathcal{M}) = \mathbb{R}_{\leq r}^{m \times n}$. The optimization problem is the minimization of $f$ over $\mathbb{R}_{\leq r}^{m \times n}$. This is executed by minimizing the lifted function $g = f \circ \varphi$.
  • Figure 2: Rank overestimation $r > r^*$ and small condition number. Our geometry is tested with various choices of $\alpha$ for the Riemannian metric, compared to the $LR^\top$ parameterization and optimization on the fixed-rank manifold.
  • Figure 3: Exact rank $r = r^*$ and exponential decay of target singular values.
  • Figure 4: Rank overestimation $r > r^*$ and exponential decay of target singular values.
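The lifted cost $g = f \circ \varphi$ from Figure 1 and the rank-overestimation setting of Figures 2 and 4 can be illustrated with a small sketch. Two assumptions are made here that the excerpt does not state: the parameterization is taken to be $\varphi(X, P) = X$, and $f$ is taken to be a least-squares matrix completion cost over a random mask of observed entries. All names (`phi`, `completion_cost`, `lift`) are hypothetical.

```python
import numpy as np

def phi(X, P):
    # Assumed form of the parameterization: keep X, forget P.
    return X

def completion_cost(A, mask):
    """Hypothetical cost f(X) = 0.5 * ||mask * (X - A)||_F^2."""
    def f(X):
        return 0.5 * np.sum((mask * (X - A)) ** 2)
    return f

def lift(f):
    """Lifted cost g = f o phi, defined on pairs (X, P)."""
    return lambda X, P: f(phi(X, P))

rng = np.random.default_rng(1)
m, n, r_star, r = 8, 7, 2, 4  # rank overestimation: r > r_star

# Target of rank r_star and a random mask of observed entries.
A = rng.standard_normal((m, r_star)) @ rng.standard_normal((r_star, n))
mask = rng.random((m, n)) < 0.6
g = lift(completion_cost(A, mask))

# A pair (A, P) above the target: V has r orthonormal columns whose span
# contains the row space of A, so A @ P vanishes.
_, _, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt[:r, :].T
P = np.eye(n) - V @ V.T

g_at_target = g(A, P)               # X equals A on every entry
g_at_zero = g(np.zeros((m, n)), P)  # (0, P) also satisfies X @ P = 0
```

With $r > r^*$ the target has non-maximal rank in $\mathbb{R}_{\leq r}^{m \times n}$, yet the pair `(A, P)` is still an ordinary point of the lifted problem. Many choices of `P` sit above the same `A`, and the lifted cost takes the same value at all of them.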

Theorems & Definitions (70)

  • Lemma 2.1
  • proof
  • Definition 2.2
  • Lemma 2.3
  • proof
  • Proposition 2.4
  • proof
  • Proposition 2.5
  • Proposition 2.6
  • proof
  • ...and 60 more