Decentralized projected Riemannian stochastic recursive momentum method for nonconvex optimization
Kangkang Deng, Jiang Hu
TL;DR
This work addresses decentralized online nonconvex optimization on compact submanifolds by proposing DPRSRM, a single-loop method that combines gradient tracking with a momentum-based variance-reduced gradient estimator and projection-based updates onto the manifold $\mathcal{M}$. The algorithm achieves an oracle complexity of $\mathcal{O}(\varepsilon^{-3/2})$ while using only $\mathcal{O}(1)$ gradient evaluations per node per iteration, and its convergence is established through separate consensus and optimality analyses. Corollaries give parameter choices that attain this rate without restarting from a large minibatch gradient, and numerical experiments on decentralized PCA and low-rank matrix completion (LRMC) show that DPRSRM outperforms state-of-the-art online decentralized manifold methods. The results indicate that the method is well suited to decentralized manifold optimization with streaming data.
Abstract
This paper studies decentralized optimization over a compact submanifold within a communication network of $n$ nodes, where each node possesses a smooth nonconvex local cost function, and the goal is to jointly minimize the sum of these local costs. We focus particularly on the online setting, where local data is processed in real time as it streams in, without the need for full data storage. We propose a decentralized projected Riemannian stochastic recursive momentum (DPRSRM) method that employs local hybrid stochastic gradient estimators and uses the network to track the global gradient. DPRSRM achieves an oracle complexity of $\mathcal{O}(\varepsilon^{-\frac{3}{2}})$, outperforming existing methods, whose complexity is at best $\mathcal{O}(\varepsilon^{-2})$. Our method requires only $\mathcal{O}(1)$ gradient evaluations per iteration for each local node and does not require restarting with a large batch gradient. Furthermore, we demonstrate the effectiveness of the proposed method compared to state-of-the-art ones through numerical experiments on principal component analysis and low-rank matrix completion problems.
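To make the three ingredients named in the abstract concrete (a recursive-momentum gradient estimator, gradient tracking over the network, and a projection back onto the manifold), the following is a minimal NumPy sketch on a toy decentralized PCA problem over the unit sphere. It is an illustration under stated assumptions, not the authors' algorithm as specified in the paper: the update order, the use of Euclidean stochastic gradients inside the estimator, the step size `alpha`, the momentum weight `tau`, the ring mixing matrix `W`, and all function names are hypothetical choices made for this example.

```python
import numpy as np

def project_sphere(y):
    # Row-wise projection onto the unit sphere (a compact submanifold).
    return y / np.linalg.norm(y, axis=1, keepdims=True)

def tangent_project(x, g):
    # Row-wise orthogonal projection of g onto the tangent space at x.
    return g - np.sum(x * g, axis=1, keepdims=True) * x

def stoch_grad(x, a):
    # Gradient of the sampled loss f(x; a) = -0.5 * (a^T x)^2, one sample per node.
    return -np.sum(a * x, axis=1, keepdims=True) * a

def dprsrm_sketch(data, W, alpha=0.05, tau=0.1, iters=500, seed=0):
    """Assumed single-loop scheme; data has shape (n_nodes, n_samples, dim)."""
    rng = np.random.default_rng(seed)
    n, m, d = data.shape
    nodes = np.arange(n)
    # All nodes start from the same random point on the sphere.
    x = project_sphere(np.tile(rng.standard_normal(d), (n, 1)))
    v = stoch_grad(x, data[nodes, rng.integers(m, size=n)])  # local estimators
    s = v.copy()                                             # tracking variables
    for _ in range(iters):
        # Consensus step plus projected tangent direction, then back to the manifold.
        x_new = project_sphere(W @ x - alpha * tangent_project(x, s))
        # One fresh sample per node, evaluated at both the new and old iterate:
        # two gradient evaluations per node per iteration.
        a = data[nodes, rng.integers(m, size=n)]
        v_new = stoch_grad(x_new, a) + (1.0 - tau) * (v - stoch_grad(x, a))
        # Gradient tracking: mix neighbours' trackers, add the estimator increment.
        s = W @ s + v_new - v
        x, v = x_new, v_new
    return x

# Toy usage: 4 nodes on a ring with a doubly stochastic mixing matrix.
n, m, d = 4, 50, 5
data = np.random.default_rng(1).standard_normal((n, m, d))
shift = np.roll(np.eye(n), 1, axis=0)
W = 0.5 * np.eye(n) + 0.25 * (shift + shift.T)
x_final = dprsrm_sketch(data, W, iters=200)
```

Each row of `x_final` is one node's iterate and lies on the unit sphere by construction. The sketch draws a single sample per node per iteration and never recomputes a large-batch gradient, which mirrors the per-iteration cost described above; it makes no claim about reproducing the stated convergence rate.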
