Optimization over bounded-rank matrices through a desingularization enables joint global and local guarantees

Quentin Rebjock; Nicolas Boumal

TL;DR

This work tackles optimization over matrices of bounded rank through a desingularization, building on Khrulkov and Oseledets [2018]: the non-smooth feasible set is parameterized by a smooth manifold $\mathcal{M}$ via a map $\varphi$ whose image is the bounded-rank variety, and one minimizes the lifted cost $g=f\circ\varphi$ on $\mathcal{M}$. It develops a Riemannian framework on $\mathcal{M}$, including a family of $\alpha$-weighted metrics, parsimonious tangent representations, multiple retractions (including second-order ones), and explicit gradient/Hessian expressions to enable general-purpose optimization. The authors prove global convergence (accumulation only at critical points) for descent-type methods and establish fast local convergence under Polyak–Łojasiewicz (PL) or Morse–Bott-type conditions, even when the limit has non-maximal rank. Numerical experiments on matrix completion show competitive performance with strong robustness to rank overestimation, and the work provides open-source implementations to facilitate broader adoption of the desingularization approach in low-rank optimization tasks.

Abstract

Convergence guarantees for optimization over bounded-rank matrices are delicate to obtain because the feasible set is a non-smooth and non-convex algebraic variety. Existing techniques include projected gradient descent, fixed-rank optimization (over the maximal-rank stratum), and the $LR^\top$ parameterization. They all lack either global guarantees (the ability to accumulate only at critical points) or fast local convergence (e.g., if the limit has non-maximal rank). We seek optimization algorithms that enjoy both. Khrulkov and Oseledets [2018] parameterize the bounded-rank variety via a desingularization to recast the optimization problem onto a smooth manifold. Building on their ideas, we develop a Riemannian geometry for this desingularization, also with care for numerical considerations. We use it to secure global convergence to critical points with fast local rates, for a large range of algorithms. On matrix completion tasks, we find that this approach is comparable to others, while enjoying better general-purpose theoretical guarantees.

Paper Structure (35 sections, 31 theorems, 88 equations, 4 figures, 1 table, 1 algorithm)


Introduction
Contributions
Related work
Context and applications.
Algorithms.
Global convergence guarantees.
Local convergence guarantees.
Positive semidefinite case.
Notation
Geometry of the desingularization
Smooth structure and tangent spaces
Retractions
Geometry of fibers and preimages
Riemannian geometry
A family of metrics with parameter $\alpha$
...and 20 more sections

Key Result

Lemma 2.1

Given $(X, P) \in \mathcal{M}$, there exist $U \in \mathrm{St}(m, r)$, $V \in \mathrm{St}(n, r)$ and a diagonal matrix $\Sigma \in {\mathbb R}^{r \times r}$ with non-negative entries such that $X = U \Sigma V^\top$ and $P = I - V V^\top$.
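The factorization in Lemma 2.1 can be checked numerically. The sketch below assumes the definition of $\mathcal{M}$ from the desingularization of Khrulkov and Oseledets, which this summary does not restate: pairs $(X, P)$ where $P$ is a symmetric projector of rank $n - r$ with $XP = 0$. The helper names `lift_point` and `check_point` are hypothetical and are not taken from the authors' code. Starting from a matrix of rank at most $r$, it builds $U$, $\Sigma$, $V$ from an SVD and sets $P = I - VV^\top$.

```python
import numpy as np


def lift_point(X, r):
    """Build (U, sigma, V, P) with X = U diag(sigma) V^T and P = I - V V^T.

    Assumes rank(X) <= r <= min(m, n). Hypothetical helper for illustration.
    """
    m, n = X.shape
    if not 0 < r <= min(m, n):
        raise ValueError("need 0 < r <= min(m, n)")
    # Full SVD so that r orthonormal columns are available even if rank(X) < r.
    U_full, s, Vt_full = np.linalg.svd(X, full_matrices=True)
    U = U_full[:, :r]        # m x r, orthonormal columns
    V = Vt_full[:r, :].T     # n x r, orthonormal columns
    sigma = s[:r]            # non-negative, possibly with (near-)zero entries
    P = np.eye(n) - V @ V.T  # projector onto the orthogonal complement of span(V)
    return U, sigma, V, P


def check_point(X, P, r, tol=1e-8):
    """Test the assumed membership conditions: P symmetric, P^2 = P,
    trace(P) = n - r, and X P = 0."""
    n = X.shape[1]
    return bool(
        np.allclose(P, P.T, atol=tol)
        and np.allclose(P @ P, P, atol=tol)
        and abs(np.trace(P) - (n - r)) < tol
        and np.allclose(X @ P, 0.0, atol=tol)
    )


rng = np.random.default_rng(0)
m, n, r = 6, 5, 3
# A rank-2 matrix, so the rank bound r = 3 is not attained.
X = rng.standard_normal((m, 2)) @ rng.standard_normal((2, n))
U, sigma, V, P = lift_point(X, r)
print(check_point(X, P, r))
print(np.allclose(U @ np.diag(sigma) @ V.T, X))
```

When $\mathrm{rank}(X) < r$, the choice of $V$ is not unique: any $r$-dimensional subspace containing the row space of $X$ gives a valid $P$, so several points of $\mathcal{M}$ sit above the same matrix $X$.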

Figures (4)

Figure 1: The desingularization manifold $\mathcal{M}$ is embedded in $\mathcal{E} = {\mathbb R}^{m \times n} \times \mathrm{Sym}(n)$. Its image through the parameterization $\varphi$ is $\varphi(\mathcal{M}) = \mathbb{R}_{\leq r}^{m \times n}$. The problem is to minimize $f$ over $\mathbb{R}_{\leq r}^{m \times n}$; this is done by minimizing the lifted function $g = f \circ \varphi$ over $\mathcal{M}$.
Figure 2: Rank overestimation $r > r^*$ and small condition number. Our geometry is tested with various choices of $\alpha$ for the Riemannian metric, compared to the $LR^\top$ parameterization and optimization on the fixed-rank manifold.
Figure 3: Exact rank $r = r^*$ and exponential decay of target singular values.
Figure 4: Rank overestimation $r > r^*$ and exponential decay of target singular values.
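The lifting $g = f \circ \varphi$ described in the Figure 1 caption can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: it takes $\varphi(X, P) = X$ (the parameterization is not spelled out in this summary), and it uses a least-squares matrix completion cost over observed entries as a stand-in for $f$. The names `phi`, `make_completion_cost`, and `lift` are hypothetical.

```python
import numpy as np


def phi(X, P):
    """Assumed parameterization: forget P and keep the bounded-rank matrix X."""
    return X


def make_completion_cost(M_obs, mask):
    """Illustrative cost f(X) = 0.5 * sum over observed entries of (X - M_obs)^2.

    mask is a 0/1 array marking observed entries (an assumption of this sketch).
    """
    def f(X):
        residual = mask * (X - M_obs)
        return 0.5 * float(np.sum(residual ** 2))
    return f


def lift(f):
    """Return the lifted cost g = f o phi, defined on pairs (X, P)."""
    return lambda X, P: f(phi(X, P))


rng = np.random.default_rng(0)
m, n = 4, 3
M_obs = rng.standard_normal((m, n))
mask = (rng.random((m, n)) < 0.5).astype(float)

f = make_completion_cost(M_obs, mask)
g = lift(f)

X = rng.standard_normal((m, n))
P = np.zeros((n, n))  # placeholder second component; g does not read it
print(g(X, P) == f(X))
```

Under this assumption the value of $g$ depends only on the first component, so all points of $\mathcal{M}$ above the same $X$ share one cost value. The extra component $P$ affects the geometry (tangent spaces, metric, retractions) rather than the objective.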

Theorems & Definitions (70)

Lemma 2.1
proof
Definition 2.2
Lemma 2.3
proof
Proposition 2.4
proof
Proposition 2.5
Proposition 2.6
proof
...and 60 more
