Distributed Riemannian Stochastic Gradient Tracking Algorithm on the Stiefel Manifold

Jishu Zhao; Xi Wang; Jinlong Lei

TL;DR

This work tackles distributed stochastic optimization on the Stiefel manifold $\text{St}(n,r)$ for multi-agent networks and proposes a distributed Riemannian stochastic gradient tracking algorithm that uses a variable sample-size gradient estimator and a polar retraction to maintain feasibility. By reformulating the problem with consensus constraints and integrating gradient-tracking dynamics, the method achieves asymptotic convergence in expectation to a stationary point (or a neighborhood of one) within a local region and with fixed step sizes. The convergence rate is characterized for exponential, polynomial, and constant sampling schedules, with explicit iteration, oracle, and communication complexities that reveal a trade-off between sampling effort and communication. Numerical PCA experiments validate the theoretical results and show competitive performance against existing Euclidean and Riemannian schemes on manifold-constrained distributed optimization problems.

Abstract

This paper investigates the distributed Riemannian stochastic optimization problem on the Stiefel manifold for multi-agent systems, where all agents work collaboratively to optimize a function modeled as the average of their expectation-valued local costs. Each agent processes only its own local cost function and communicates with neighboring agents to reach an optimal solution while ensuring consensus. Since the local Riemannian gradient cannot be computed directly in the stochastic regime, we estimate it by averaging a variable number of sampled gradients, which introduces noise into the system. We then propose a distributed Riemannian stochastic optimization algorithm on the Stiefel manifold by combining the variable sample-size gradient approximation with gradient tracking dynamics. Notably, a suitably chosen increasing sample size plays an important role in improving the algorithm's efficiency, as it reduces the noise variance. The iterates of all agents are proved to converge in expectation to a stationary point (or a neighborhood of one) with fixed step sizes. We further establish the convergence rate of the iterates when the sample size is exponentially increasing, polynomially increasing, or constant. Finally, numerical experiments are conducted to demonstrate the theoretical results.
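The variance-reduction effect of a larger sample size can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a PCA-style local cost $f(X) = -\tfrac{1}{2}\mathbb{E}\|X^\top \xi\|_F^2$ with Gaussian data, and the helper names (`tangent_project`, `sampled_riemannian_grad`, `draw`, `mse`) are hypothetical. The Euclidean gradient is averaged over $N$ samples and then projected onto the tangent space of $\text{St}(n,r)$ at $X$.

```python
import numpy as np

def tangent_project(X, G):
    """Project G onto the tangent space of St(n, r) at X (Euclidean metric)."""
    XtG = X.T @ G
    return G - X @ (0.5 * (XtG + XtG.T))

def sampled_riemannian_grad(X, samples):
    """Average the per-sample Euclidean gradients -xi xi^T X, then project."""
    G = -(samples.T @ (samples @ X)) / samples.shape[0]
    return tangent_project(X, G)

rng = np.random.default_rng(0)
n, r = 8, 2
X, _ = np.linalg.qr(rng.standard_normal((n, r)))  # a point on St(n, r)

# Assumed data model: xi ~ N(0, diag(scales**2)), so E[xi xi^T] is known.
scales = np.linspace(1.0, 2.0, n)
Sigma = np.diag(scales ** 2)
exact = tangent_project(X, -Sigma @ X)  # exact Riemannian gradient

def draw(N):
    """Draw N samples, one per row."""
    return rng.standard_normal((N, n)) * scales

def mse(N, trials=200):
    """Mean squared Frobenius error of the N-sample estimate."""
    errs = [np.linalg.norm(sampled_riemannian_grad(X, draw(N)) - exact) ** 2
            for _ in range(trials)]
    return float(np.mean(errs))

g_hat = sampled_riemannian_grad(X, draw(5))
mse_by_N = {N: mse(N) for N in (1, 10, 100)}
print(mse_by_N)
```

In this setup the mean squared error of the estimate shrinks roughly in proportion to $1/N$, which is the mechanism the abstract refers to when it says an increasing sample size reduces the noise variance.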

Paper Structure (18 sections, 15 theorems, 141 equations, 5 figures, 2 tables, 1 algorithm)


Introduction
Related Works
Contributions
Preliminaries
Riemannian geometry
Average on Stiefel Manifold
Optimality Condition
Distributed Riemannian Stochastic Gradient Tracking Method
Problem Reformulation
Algorithm Design
Main Results
Technical Lemmas
Preliminary Results
Main Results
Numerical Experiments
...and 3 more sections

Key Result

Lemma 1

[BoumalNicolas; doi:10.1137/20M1321000] Let $R$ denote a retraction on $\text{St}(n,r)$. Then there exists a constant $M$ such that [...]. Moreover, if the retraction is a polar retraction, then for any $X, Y \in \text{St}(n,r)$ and $v \in T_X\mathcal{M}$, it holds that [...].
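The polar retraction named in Lemma 1 can be sketched numerically. The lemma's inequalities are not available in this text, so the two checks below are assumptions about the kind of property involved: a second-order bound $\|R_X(v) - (X+v)\|_F \le M\|v\|_F^2$ and the nonexpansive bound $\|R_X(v) - Y\|_F \le \|X + v - Y\|_F$, both of which are standard for the polar retraction. The constant $0.5$ in the check follows from $\sqrt{1+s} - 1 \le s/2$ and is specific to this sketch, not taken from the paper. Function names are hypothetical.

```python
import numpy as np

def tangent_project(X, G):
    """Project G onto the tangent space of St(n, r) at X."""
    XtG = X.T @ G
    return G - X @ (0.5 * (XtG + XtG.T))

def polar_retraction(X, v):
    """Orthonormal polar factor of X + v, computed from a thin SVD."""
    U, _, Vt = np.linalg.svd(X + v, full_matrices=False)
    return U @ Vt

rng = np.random.default_rng(1)
n, r = 10, 3
X, _ = np.linalg.qr(rng.standard_normal((n, r)))
Y, _ = np.linalg.qr(rng.standard_normal((n, r)))

feasible = True       # does R_X(v) stay on the manifold?
nonexpansive = True   # is ||R_X(v) - Y|| <= ||X + v - Y||?
worst_ratio = 0.0     # largest observed ||R_X(v) - (X + v)|| / ||v||^2

for scale in (1e-2, 1e-1, 1.0):
    v = scale * tangent_project(X, rng.standard_normal((n, r)))
    R = polar_retraction(X, v)
    feasible &= bool(np.allclose(R.T @ R, np.eye(r)))
    ratio = np.linalg.norm(R - (X + v)) / np.linalg.norm(v) ** 2
    worst_ratio = max(worst_ratio, float(ratio))
    nonexpansive &= bool(
        np.linalg.norm(R - Y) <= np.linalg.norm(X + v - Y) + 1e-12
    )

print(feasible, nonexpansive, worst_ratio)
```

Using the SVD avoids forming $(I + v^\top v)^{-1/2}$ explicitly; for a tangent vector $v$ at $X$ both expressions give the same point on the manifold.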

Figures (5)

Figure 1: Numerical results on a synthetic dataset with DRSGD and Algorithm 1, eigengap $\Delta = 0.8$, graph: ring, $t = 1$.
Figure 2: Iteration complexity of Algorithm 1 and DRSGD.
Figure 3: Oracle complexity of Algorithm 1 and DRSGD.
Figure 4: The performance of Algorithm 1 with $N_k = [0.85^{-k}], [0.9^{-k}], [0.95^{-k}]$ under a ring graph.
Figure 5: The performance of Algorithm 1 with $N_k = 1, k+1, [0.9^{-k}]$ under a ring graph.
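The sample-size schedules in the captions of Figures 4 and 5 can be compared by their cumulative oracle cost. The sketch below rests on two assumptions that are not confirmed by the text: the bracket $[\cdot]$ is read as the ceiling function, and the iteration index $k$ starts at 0. The function names are hypothetical.

```python
import math

def exponential(k, q=0.9):
    """N_k = ceil(q^{-k}); reading [.] as a ceiling is an assumption."""
    return math.ceil(q ** (-k))

def polynomial(k):
    """N_k = k + 1."""
    return k + 1

def constant(k):
    """N_k = 1."""
    return 1

def oracle_calls(schedule, K):
    """Total sampled gradients one agent draws over iterations k = 0..K-1."""
    return sum(schedule(k) for k in range(K))

K = 50
totals = {
    "constant": oracle_calls(constant, K),
    "polynomial": oracle_calls(polynomial, K),
    "exponential": oracle_calls(exponential, K),
}
print(totals)
```

For a fixed number of iterations the exponential schedule draws the most samples, which is one side of the trade-off between sampling effort and communication described in the TL;DR.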

Theorems & Definitions (27)

Lemma 1
Remark 1
Remark 2
Lemma 2
Proposition 1
Remark 3
Remark 4
Remark 5
Definition 1
Lemma 3
...and 17 more
