Retraction-Free Decentralized Non-convex Optimization with Orthogonal Constraints
Youbang Sun, Shixiang Chen, Alfredo Garcia, Shahin Shahrampour
TL;DR
This work tackles decentralized non-convex optimization with orthogonal constraints on the Stiefel manifold, where traditional projection or retraction steps are computationally expensive. It introduces the Decentralized Retraction-Free Gradient Tracking (DRFGT) algorithm, a fully decentralized, infeasible-but-convergent method that uses a landing-field update to drive iterates toward feasibility without retractions. The authors establish an ergodic $\mathcal{O}(1/K)$ convergence rate and, under a local Riemannian PŁ condition, a local linear convergence rate, along with a safe-step-size analysis that ensures iterates stay within a neighborhood of the manifold. Numerical experiments on decentralized PCA with synthetic and real data corroborate the theory, showing DRFGT achieves competitive accuracy with substantially reduced computational overhead and favorable CPU-time performance compared to retraction-based methods.
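To make the landing-field idea concrete, the following is a minimal single-node sketch, not the authors' implementation. It assumes one common form of the landing field for $X \in \mathbb{R}^{d \times r}$, namely $\Lambda(X) = \mathrm{skew}(\nabla f(X) X^\top) X + \lambda X (X^\top X - I)$, applied to a synthetic PCA objective $f(X) = -\tfrac{1}{2}\mathrm{tr}(X^\top A X)$. The function names, the penalty weight `lam`, the step size `eta`, and the test matrix are all illustrative assumptions; the paper's exact field and constants may differ.

```python
import numpy as np

def skew(M):
    """Skew-symmetric part of a square matrix."""
    return 0.5 * (M - M.T)

def landing_field(X, G, lam):
    """Assumed landing field: a tangent-like descent term plus a pull toward X^T X = I."""
    r = X.shape[1]
    return skew(G @ X.T) @ X + lam * X @ (X.T @ X - np.eye(r))

def landing_pca(A, r, eta=0.1, lam=1.0, iters=2000, seed=0):
    """Minimize f(X) = -0.5 * tr(X^T A X) with plain additive steps (no QR/SVD inside the loop)."""
    rng = np.random.default_rng(seed)
    # A feasible starting point; the QR here is used once for initialization only.
    X, _ = np.linalg.qr(rng.standard_normal((A.shape[0], r)))
    for _ in range(iters):
        G = -A @ X                               # Euclidean gradient of f at X
        X = X - eta * landing_field(X, G, lam)   # retraction-free update
    return X

# Synthetic symmetric matrix with a clear gap after the top-3 eigenvalues (assumed test data).
rng = np.random.default_rng(1)
d, r = 20, 3
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
eigs = np.concatenate(([1.0, 0.9, 0.8], np.linspace(0.2, 0.0, d - r)))
A = Q @ np.diag(eigs) @ Q.T

X = landing_pca(A, r)
feas_err = np.linalg.norm(X.T @ X - np.eye(r))   # distance from orthogonality
captured = np.trace(X.T @ A @ X)                 # should approach 1.0 + 0.9 + 0.8
print(f"feasibility error: {feas_err:.2e}, captured variance: {captured:.6f}")
```

The two terms of this field are orthogonal in the trace inner product, so to first order the penalty term alone controls the distance to the manifold. That is why the iterates in this sketch can leave the constraint set during the run and still end up feasible.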
Abstract
In this paper, we investigate decentralized non-convex optimization with orthogonal constraints. Conventional algorithms for this setting rely on manifold retractions or other projections to ensure feasibility, both of which involve costly linear algebra operations (e.g., SVD or matrix inversion). Infeasible methods, by contrast, can deliver similar performance at lower computational cost. Inspired by this, we propose the first decentralized version of the retraction-free landing algorithm, called \textbf{D}ecentralized \textbf{R}etraction-\textbf{F}ree \textbf{G}radient \textbf{T}racking (DRFGT). We prove that DRFGT achieves an ergodic convergence rate of $\mathcal{O}(1/K)$, matching that of centralized, retraction-based methods. We further establish that, under a local Riemannian PŁ condition, DRFGT converges locally at a linear rate. Numerical experiments demonstrate that DRFGT performs on par with state-of-the-art retraction-based methods at substantially reduced computational overhead.
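As a rough illustration of how gradient tracking and a retraction-free update can be combined over a network, here is a hedged sketch on a synthetic decentralized PCA problem. It is not the authors' algorithm as stated in the paper. It assumes each agent $i$ holds a local matrix $A_i$, mixes its iterate with neighbors through a doubly stochastic matrix $W$, and tracks the network average of an assumed local landing field $\Lambda_i(X) = \mathrm{skew}(\nabla f_i(X) X^\top) X + \lambda X (X^\top X - I)$. The ring topology, the step size, the choice of tracked quantity, and all names are illustrative assumptions.

```python
import numpy as np

def local_landing_field(X, A, lam):
    """Assumed landing field for the local objective f_i(X) = -0.5 * tr(X^T A X)."""
    G = -A @ X
    S = 0.5 * (G @ X.T - X @ G.T)            # skew-symmetric part of G X^T
    return S @ X + lam * X @ (X.T @ X - np.eye(X.shape[1]))

def decentralized_landing(As, W, r, eta=0.02, lam=1.0, iters=6000, seed=0):
    """Consensus step plus a tracked landing direction; no retraction is applied."""
    n, d = len(As), As[0].shape[0]
    rng = np.random.default_rng(seed)
    X0, _ = np.linalg.qr(rng.standard_normal((d, r)))
    X = np.stack([X0] * n)                   # all agents start from the same point
    F = np.stack([local_landing_field(X[i], As[i], lam) for i in range(n)])
    Y = F.copy()                             # tracker initialized to the local fields
    for _ in range(iters):
        X_new = np.einsum("ij,jdr->idr", W, X) - eta * Y
        F_new = np.stack([local_landing_field(X_new[i], As[i], lam) for i in range(n)])
        Y = np.einsum("ij,jdr->idr", W, Y) + F_new - F   # tracking correction
        X, F = X_new, F_new
    return X

# Four agents on a ring with lazy mixing weights (assumed topology).
n, d, r = 4, 20, 3
P = np.roll(np.eye(n), 1, axis=1)
W = 0.5 * np.eye(n) + 0.25 * (P + P.T)

# Local matrices are noisy copies of a common matrix whose average has known top eigenvalues.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
eigs = np.concatenate(([1.0, 0.9, 0.8], np.linspace(0.2, 0.0, d - r)))
A_avg = Q @ np.diag(eigs) @ Q.T
noise = [0.02 * rng.standard_normal((d, d)) for _ in range(n)]
noise = [0.5 * (E + E.T) for E in noise]
mean_noise = sum(noise) / n
As = [A_avg + E - mean_noise for E in noise]  # local matrices average exactly to A_avg

X = decentralized_landing(As, W, r)
X_bar = X.mean(axis=0)
consensus_err = max(np.linalg.norm(X[i] - X_bar) for i in range(n))
feas_err = np.linalg.norm(X_bar.T @ X_bar - np.eye(r))
captured = np.trace(X_bar.T @ A_avg @ X_bar)
print(f"consensus: {consensus_err:.2e}, feasibility: {feas_err:.2e}, captured: {captured:.6f}")
```

In this sketch each iteration costs only matrix products and one round of neighbor communication. A retraction-based variant would add a QR, SVD, or matrix inverse per agent per step, which is the overhead the abstract refers to.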
