An efficient algorithm for the Riemannian logarithm on the Stiefel manifold for a family of Riemannian metrics

Simon Mataigne, Ralf Zimmermann, Nina Miolane

TL;DR

This work addresses the computation of the Riemannian logarithm on the Stiefel manifold under a one-parameter family of metrics that includes the Euclidean and canonical cases. It generalizes Zimmermann's efficient matrix-algebraic algorithm for the canonical metric to the $\beta$-metric family, presenting backward and forward iterations with proofs of local linear convergence, as well as practical forward variants that avoid expensive nonlinear matrix solves. The paper introduces a BCH-based analysis, efficient initializations, and a quasi-geodesic subproblem framework. Numerical experiments show that the accelerated forward variant often outperforms shooting-based methods across a wide range of $\beta$. The work provides reproducible code and benchmarks, and uses probabilistic modeling to estimate convergence radii, which supports the method's use for geodesic computations on the Stiefel manifold in optimization, statistics, and machine learning.

Abstract

Since the Stiefel manifold was popularized for numerical applications in 1998 by a seminal paper of Edelman et al., it has proven key to solving many problems in optimization, statistics and machine learning. In 2021, Hüper et al. proposed a one-parameter family of Riemannian metrics on the Stiefel manifold, subsuming the well-known Euclidean and canonical metrics. Since then, several methods have been proposed to obtain a candidate for the Riemannian logarithm given any metric from the family. Most of these methods are based on the shooting method or rely on optimization approaches. For the canonical metric, Zimmermann proposed in 2017 a particularly efficient method based on a pure matrix-algebraic approach. In this paper, we derive a generalization of this algorithm that works for the one-parameter family of Riemannian metrics. The algorithm is proposed in two versions, termed backward and forward, and we prove that both preserve the local linear convergence previously established for Zimmermann's algorithm under the canonical metric.

Paper Structure

This paper contains 33 sections, 7 theorems, 71 equations, 7 figures, 1 table, 4 algorithms.

Key Result

Theorem 2.1

(The Riemannian exponential [ZimmermannRalf22].) For all ${U}\in \mathrm{St}(n,p)$ and ${\Delta} \in T_{U}\mathrm{St}(n,p)$, we have
$$\mathrm{Exp}_{\beta,{U}}({\Delta}) = \begin{bmatrix}{U} & {Q}\end{bmatrix}\exp\left(\begin{bmatrix}2\beta{A} & -{B}^T\\ {B} & {0}\end{bmatrix}\right){I}_{n\times p}\exp\left((1-2\beta){A}\right),$$
where ${A} = {U}^T {\Delta}\in \mathrm{Skew}(p)$ and ${Q}{B} = ({I}-{U}{U}^T){\Delta}\in\mathbb{R}^{n\times p}$ is any matrix decomposition where ${Q}\in\mathrm{St}(n,n-p)$ with ${Q}^T {U}={0}$ and ${B}\in\mathbb{R}^{(n-p)\times p}$.
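
As an illustration, the exponential map can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' implementation: it assumes the closed form $\mathrm{Exp}_{\beta,U}(\Delta) = [U\ Q]\exp\left(\begin{bmatrix}2\beta A & -B^T\\ B & 0\end{bmatrix}\right)I_{n\times p}\exp((1-2\beta)A)$, with $\beta=1/2$ the canonical and $\beta=1$ the Euclidean metric. The function names `expm_skew` and `stiefel_exp` are hypothetical, and the complement basis $Q$ is obtained from a full QR factorization of $U$, which is only one admissible choice.

```python
import numpy as np


def expm_skew(S):
    """Exponential of a real skew-symmetric matrix S.

    Uses the fact that 1j*S is Hermitian: if 1j*S = V diag(w) V^H,
    then exp(S) = V diag(exp(-1j*w)) V^H, which is real.
    (scipy.linalg.expm would be the usual general-purpose choice.)
    """
    w, V = np.linalg.eigh(1j * S)
    return ((V * np.exp(-1j * w)) @ V.conj().T).real


def stiefel_exp(U, Delta, beta):
    """Sketch of Exp_{beta,U}(Delta) on St(n, p) under the assumed closed form."""
    n, p = U.shape
    A = U.T @ Delta  # p x p, skew-symmetric when Delta is tangent at U
    # Orthonormal basis of the complement of span(U): last n-p columns
    # of the full QR factor of U, so that Q^T U = 0.
    Q = np.linalg.qr(U, mode="complete")[0][:, p:]
    B = Q.T @ Delta  # (n-p) x p, so that Q @ B = (I - U U^T) Delta
    S = np.block([[2 * beta * A, -B.T],
                  [B, np.zeros((n - p, n - p))]])
    M = expm_skew(S)[:, :p]  # exp(S) @ I_{n x p}: keep the first p columns
    return np.hstack([U, Q]) @ M @ expm_skew((1 - 2 * beta) * A)


# Usage: random base point and tangent vector on St(6, 3).
rng = np.random.default_rng(0)
n, p = 6, 3
U = np.linalg.qr(rng.standard_normal((n, p)))[0]
W = rng.standard_normal((p, p))
Z = rng.standard_normal((n, p))
Delta = U @ (W - W.T) / 2 + (np.eye(n) - U @ U.T) @ Z  # tangent at U
Y = stiefel_exp(U, 0.5 * Delta, beta=0.75)
print(np.linalg.norm(Y.T @ Y - np.eye(p)))  # orthonormality defect of Y
```

Forming the full $n\times n$ exponential follows the theorem statement literally; for $n \gg p$ one would typically use a thin QR of $({I}-{U}{U}^T){\Delta}$ and a $2p\times 2p$ exponential instead.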

Figures (7)

  • Figure 1: Conceptual illustration of the Stiefel manifold $\mathrm{St}(n,p)$, the tangent space $T_{U}\mathrm{St}(n,p)$, the exponential map $\mathrm{Exp}_{\beta,{U}}({\Delta})$ and the logarithmic map $\mathrm{Log}_{\beta,{U}}(\widetilde{U})$ (see Section \ref{subsec:exponential}).
  • Figure 1: Open set of pairs $(\beta,\delta)$ satisfying Condition \ref{cond:delta}. Condition \ref{cond:delta} gathers sufficient conditions under which Theorem \ref{thm:convergence} guarantees the convergence of the backward Algorithm \ref{alg:generalizedStiefel}.
  • Figure 1: Convergence of Algorithm \ref{alg:generalizedStiefel} with an accelerated forward iteration (${Q}_k = \exp((2\beta-1){A}_k)$) from Section \ref{sec:fastforward} (solid lines) and the fixed forward iteration (stylized lines) on $\mathrm{St}(n=120,p=50)$. The matrices ${U},\widetilde{U}$ are randomly generated at Frobenius distance $\|{U}-\widetilde{U}\|_\mathrm{F} \in \{0.03,0.19,0.37\}\cdot2\sqrt{p}$ for the top left, top right and bottom left plots, respectively. The stars "$*$" on the y-axes indicate that the residuals are normalized by the residual of the first iteration. The bottom right plot shows how the improvement factor (i.e., the ratio between the number of iterations of the fixed forward and the accelerated forward method) varies as the Frobenius distance increases in $\mathrm{St}(n=60, p=30)$.
  • Figure 1: On the left, the logistic regression models fitted on 1000 random samples on $\mathrm{St}(32,16)$. The models are trained until $\|\nabla f(\theta)\|_\mathrm{F}<10^{-8}$ (see Appendix \ref{app:logisticmodel}). The logistic model estimates the probability of success of Algorithm \ref{alg:generalizedStiefel} (implemented with 2 pseudo-backward sub-iterations). For $\beta$ from $0.6$ to $1$, the $R^2$ values of the fits are respectively $\{0.85, 0.97,0.96,0.98,0.97\}$, confirming the goodness of fit. On the right, the evolution of the radius of convergence as $p$ varies on $\mathrm{St}(2p,p)$.
  • Figure 2: An artist's view of the geometric construction of the momentum $\Delta_k$. On the left, the figure shows a construction that is equivalent to Equation \ref{eq:momentum_def} in the context of a flat space, e.g., $\mathrm{Skew}(p)$ equipped with the standard inner product $\langle\cdot,\cdot\rangle_\mathrm{F}$. On the right, the figure illustrates the analogous construction of $\Delta_k$ in the context of the Riemannian manifold $\mathrm{SO}(p)$ viewed as a subset of $(\mathrm{St}(p, p),\langle\cdot,\cdot\rangle_{\beta=1})$.
  • ...and 2 more figures

Theorems & Definitions (17)

  • Theorem 2.1
  • Remark 3.1
  • Theorem 3.2
  • Remark 4.1
  • Remark 4.2
  • Theorem 4.3
  • Proof 1
  • Remark 4.5
  • Remark 6.1
  • Proposition A.1
  • ...and 7 more