New vector transport operators extending a Riemannian CG algorithm to generalized Stiefel manifold with low-rank applications
Xuejie Wang, Kangkang Deng, Zheng Peng, Chengcheng Yan
TL;DR
This work addresses optimization under generalized orthogonality constraints, posed on the generalized Stiefel manifold $\operatorname{St}_M(n,p)=\{X\in\mathbb{R}^{n\times p} : X^{\top}MX=I_p\}$ with $M$ symmetric positive definite, endowed with the non-standard metric $\langle \xi,\eta\rangle_X = \operatorname{tr}(\xi^{\top}M\eta)$. It introduces two vector transports based on the Cayley transform (a differentiated retraction and an isometric variant) and proves that both satisfy the Ring-Wirth non-expansive condition under this metric. Using these transports, the modified PRP conjugate gradient method is extended to the generalized Stiefel manifold, and with a non-monotone line search it is shown to converge globally to a stationary point under standard assumptions. Numerical experiments on generalized eigenvalue problems and canonical correlation analysis indicate competitive or superior efficiency, particularly for large-scale low-rank problems.
Abstract
This paper proposes two new vector transport operators, based on the Cayley transform, for the generalized Stiefel manifold endowed with a non-standard metric. Specifically, it introduces the differentiated retraction and an approximation of the differentiated matrix exponential by the Cayley transform. Both vector transports are shown to satisfy the Ring-Wirth non-expansive condition under the non-standard metric, and one of them is also isometric. Building on these vector transport operators, we extend the modified Polak-Ribière-Polyak (PRP) conjugate gradient method to the generalized Stiefel manifold. Under a non-monotone line search condition, we prove that the algorithm converges globally to a stationary point. The efficiency of the proposed vector transport operators is validated empirically through numerical experiments on generalized eigenvalue problems and canonical correlation analysis.
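To make the objects in the abstract concrete, the following is a minimal numerical sketch, not the paper's exact operators. It assumes the constraint $X^{\top}MX=I_p$ with $M$ symmetric positive definite and uses a generic Cayley-type map $C(t)=(I-\tfrac{t}{2}WM)^{-1}(I+\tfrac{t}{2}WM)$ with an arbitrary skew-symmetric $W$; the helper names (`random_spd`, `random_point`, `random_skew`, `cayley_map`, `m_inner`) are hypothetical. Because $C(t)^{\top}MC(t)=M$, applying $C(t)$ to a point keeps it on the manifold, and applying it to tangent vectors preserves the inner product $\operatorname{tr}(\xi^{\top}M\eta)$, which is the sense in which a transport of this form is isometric.

```python
import numpy as np

def random_spd(n, rng):
    # Symmetric positive definite M (an assumed test matrix).
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

def random_point(M, p, rng):
    # X = L^{-T} Q with M = L L^T and Q^T Q = I gives X^T M X = I_p.
    L = np.linalg.cholesky(M)
    Q, _ = np.linalg.qr(rng.standard_normal((M.shape[0], p)))
    return np.linalg.solve(L.T, Q)

def random_skew(n, rng):
    A = rng.standard_normal((n, n))
    return A - A.T

def cayley_map(W, M, t, V):
    # Apply C(t) = (I - t/2 W M)^{-1} (I + t/2 W M) to V.
    # I - t/2 W M is invertible: W M is similar to a skew-symmetric matrix.
    I = np.eye(M.shape[0])
    A = 0.5 * t * (W @ M)
    return np.linalg.solve(I - A, (I + A) @ V)

def m_inner(xi, eta, M):
    # Non-standard metric <xi, eta>_X = tr(xi^T M eta).
    return np.trace(xi.T @ M @ eta)

rng = np.random.default_rng(0)
n, p, t = 8, 3, 0.3
M = random_spd(n, rng)
X = random_point(M, p, rng)
W = random_skew(n, rng)

# Tangent vectors at X: xi = S M X with S skew satisfies X^T M xi + xi^T M X = 0.
xi = random_skew(n, rng) @ M @ X
eta = random_skew(n, rng) @ M @ X

Y = cayley_map(W, M, t, X)        # new point on the manifold
xi_t = cayley_map(W, M, t, xi)    # transported tangent vectors
eta_t = cayley_map(W, M, t, eta)

feas_err = np.linalg.norm(Y.T @ M @ Y - np.eye(p))
tan_scale = max(1.0, np.linalg.norm(Y.T @ M @ xi_t))
tan_err = np.linalg.norm(Y.T @ M @ xi_t + xi_t.T @ M @ Y) / tan_scale
before = m_inner(xi, eta, M)
iso_err = abs(m_inner(xi_t, eta_t, M) - before) / max(1.0, abs(before))

print("feasibility error:", feas_err)
print("relative tangency error:", tan_err)
print("relative isometry error:", iso_err)
```

All three printed errors should be at the level of floating-point round-off. The linear solve with the full $n\times n$ matrix is used only for clarity; it does not exploit the low-rank structure that the paper targets for large-scale problems.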
