Distributed Riemannian Stochastic Gradient Tracking Algorithm on the Stiefel Manifold
Jishu Zhao, Xi Wang, Jinlong Lei
TL;DR
This work tackles distributed stochastic optimization on the Stiefel manifold $St(n,r)$ for multi-agent networks and proposes a distributed Riemannian stochastic gradient tracking algorithm that employs a variable sample-size gradient estimator and a polar retraction to maintain feasibility. By reformulating the problem with consensus constraints and integrating a gradient-tracking dynamic, the method achieves asymptotic convergence in expectation to a stationary point (or a neighborhood of one) within a local region and with fixed step sizes. The convergence rate is characterized for exponential, polynomial, and constant sampling schedules, with explicit iteration, oracle, and communication complexities that reveal a trade-off between sampling effort and communication. Numerical PCA experiments validate the theoretical results and demonstrate competitive performance against existing Euclidean and Riemannian schemes, highlighting practical scalability on manifold-constrained distributed optimization problems.
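For orientation, the consensus reformulation mentioned above can be written roughly as follows. The symbols $N$ (number of agents), $f_i$, $F_i$, and $\xi_i$ are notation assumed here for illustration and may differ from the paper's own:

$$\min_{x_1,\dots,x_N} \ \frac{1}{N}\sum_{i=1}^{N} f_i(x_i), \qquad f_i(x_i) = \mathbb{E}_{\xi_i}\left[F_i(x_i,\xi_i)\right], \qquad \text{s.t. } x_1 = \dots = x_N, \quad x_i \in St(n,r),$$

where $St(n,r) = \{x \in \mathbb{R}^{n\times r} : x^\top x = I_r\}$.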
Abstract
This paper focuses on the distributed Riemannian stochastic optimization problem on the Stiefel manifold for multi-agent systems, where all the agents work collaboratively to optimize a function modeled by the average of their expectation-valued local costs. Each agent only processes its own local cost function and communicates with neighboring agents to achieve optimal results while ensuring consensus. Since the local Riemannian gradient in stochastic regimes cannot be directly calculated, we estimate the gradient by the average of a variable number of sampled gradients, which, however, introduces noise into the system. We then propose a distributed Riemannian stochastic optimization algorithm on the Stiefel manifold by combining the variable sample-size gradient approximation method with the gradient tracking dynamic. It is worth noting that a suitably chosen increasing sample size plays an important role in improving the algorithm's efficiency, as it reduces the noise variance. In expectation, the iterates of all agents are proved to converge to a stationary point (or a neighborhood of one) with fixed step sizes. We further establish the convergence rate of the iterates for the cases in which the sample size is exponentially increasing, polynomially increasing, or constant. Finally, numerical experiments are implemented to demonstrate the theoretical results.
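The ingredients named in the abstract (a variable sample-size gradient estimate, a gradient-tracking update, and a polar retraction) can be illustrated with a minimal NumPy sketch on a PCA-type cost. This is not the authors' implementation: the update form, the ring mixing matrix, the step sizes `alpha` and `beta`, the schedule parameters, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def polar_retraction(x, v):
    # Polar retraction: orthogonal factor of the polar decomposition of x + v.
    u, _, vt = np.linalg.svd(x + v, full_matrices=False)
    return u @ vt

def tangent_projection(x, z):
    # Orthogonal projection of z onto the tangent space of St(n, r) at x.
    s = x.T @ z
    return z - x @ (s + s.T) / 2

def sample_size(k, schedule="exponential", base=1.2, power=2, const=10):
    # Hypothetical schedules for the number of samples drawn at iteration k.
    if schedule == "exponential":
        return int(np.ceil(base ** k))
    if schedule == "polynomial":
        return (k + 1) ** power
    return const

def sampled_rgrad(x, draw, batch):
    # Average `batch` sampled Euclidean gradients of the assumed PCA cost
    # f(x) = -0.5 * E[||x^T xi||^2], then project onto the tangent space.
    z = draw(batch)                       # shape (batch, n)
    egrad = -(z.T @ (z @ x)) / batch
    return tangent_projection(x, egrad)

def run(num_agents=4, n=8, r=2, iters=30, alpha=0.5, beta=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # Assumed ring network with a doubly stochastic mixing matrix.
    w = np.zeros((num_agents, num_agents))
    for i in range(num_agents):
        w[i, i] += 0.5
        w[i, (i + 1) % num_agents] += 0.25
        w[i, (i - 1) % num_agents] += 0.25
    # Synthetic local data: agent i samples xi = A_i @ (standard normal vector).
    mix = [rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(num_agents)]
    draws = [lambda b, a=a: rng.standard_normal((b, n)) @ a.T for a in mix]
    x0 = np.linalg.qr(rng.standard_normal((n, r)))[0]
    xs = [x0.copy() for _ in range(num_agents)]
    gs = [sampled_rgrad(xs[i], draws[i], sample_size(0)) for i in range(num_agents)]
    ys = [g.copy() for g in gs]           # gradient-tracking variables
    for k in range(iters):
        new_xs = []
        for i in range(num_agents):
            mixed = sum(w[i, j] * xs[j] for j in range(num_agents))
            # Consensus direction minus the projected tracked gradient.
            step = (alpha * tangent_projection(xs[i], mixed)
                    - beta * tangent_projection(xs[i], ys[i]))
            new_xs.append(polar_retraction(xs[i], step))
        new_gs = [sampled_rgrad(new_xs[i], draws[i], sample_size(k + 1))
                  for i in range(num_agents)]
        # Tracking update: mix neighbors' trackers, add the gradient difference.
        ys = [sum(w[i, j] * ys[j] for j in range(num_agents)) + new_gs[i] - gs[i]
              for i in range(num_agents)]
        xs, gs = new_xs, new_gs
    return xs
```

Calling `run()` returns one iterate per agent; because every update ends with the polar retraction, each iterate keeps orthonormal columns regardless of the noise in the sampled gradients. Swapping `schedule` in `sample_size` between `"exponential"`, `"polynomial"`, and a constant is one way to experiment with the sampling regimes the abstract refers to.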
