Decentralized projected Riemannian stochastic recursive momentum method for nonconvex optimization

Kangkang Deng, Jiang Hu

TL;DR

This work addresses decentralized online nonconvex optimization on compact submanifolds by proposing DPRSRM, a single-loop method that blends gradient tracking with a momentum-based variance-reduced estimator and projection-based updates on $\mathcal{M}$. The algorithm achieves an oracle complexity of $O(\varepsilon^{-3/2})$ with $O(1)$ gradient evaluations per node per iteration, and its convergence is established via separate consensus and optimality analyses. Corollaries show practical parameter choices yield the claimed rate without large minibatch restarts, and numerical experiments on decentralized PCA and LRMC demonstrate superiority over state-of-the-art online decentralized manifold methods. The results imply efficient, scalable decentralized optimization on manifolds for streaming data applications.

Abstract

This paper studies decentralized optimization over a compact submanifold within a communication network of $n$ nodes, where each node possesses a smooth nonconvex local cost function and the goal is to jointly minimize the sum of these local costs. We focus on the online setting, where local data is processed in real time as it streams in, without the need for full data storage. We propose a decentralized projected Riemannian stochastic recursive momentum (DPRSRM) method that employs local hybrid stochastic gradient estimators and uses the network to track the global gradient. DPRSRM achieves an oracle complexity of $\mathcal{O}(\varepsilon^{-3/2})$, improving on existing methods, whose complexity is at best $\mathcal{O}(\varepsilon^{-2})$. Our method requires only $\mathcal{O}(1)$ gradient evaluations per iteration at each local node and does not require restarting with a large-batch gradient. Furthermore, we demonstrate the effectiveness of the proposed method against state-of-the-art alternatives through numerical experiments on principal component analysis and low-rank matrix completion problems.
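The three ingredients named in the abstract (a recursive-momentum gradient estimator, gradient tracking over the network, and a projection back onto the manifold) can be illustrated with a minimal NumPy sketch. This is not the paper's algorithm statement. It assumes the unit sphere as the compact submanifold, a ring graph with uniform weights of 1/3, a streaming PCA-type objective, a common starting point for all nodes, and a generic STORM-type recursion for the estimator. The function names, the parameters `alpha`, `tau`, and `batch`, and the order of the updates are all illustrative choices.

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic matrix for a ring: each node averages itself and its two neighbors."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] += 1.0 / 3.0
        W[i, (i - 1) % n] += 1.0 / 3.0
        W[i, (i + 1) % n] += 1.0 / 3.0
    return W

def project_sphere(y):
    """Row-wise projection onto the unit sphere (assumed manifold)."""
    return y / np.linalg.norm(y, axis=1, keepdims=True)

def tangent_project(x, g):
    """Row-wise projection of g onto the tangent space of the sphere at x."""
    return g - np.sum(x * g, axis=1, keepdims=True) * x

def stoch_rgrad(x, samples):
    """Stochastic Riemannian gradient of f_i(x) = -0.5 * E[(a^T x)^2] at every node.

    x has shape (n, d); samples has shape (n, batch, d), one small batch per node.
    """
    inner = np.einsum("nbd,nd->nb", samples, x)            # a^T x for each sample
    egrad = -np.einsum("nb,nbd->nd", inner, samples) / samples.shape[1]
    return tangent_project(x, egrad)

def dprsrm_sketch(n=8, d=10, iters=3000, alpha=0.02, tau=0.1, batch=4, seed=0):
    """Illustrative decentralized loop; all hyperparameters are assumptions."""
    rng = np.random.default_rng(seed)
    W = ring_mixing_matrix(n)
    # Streaming data with covariance diag(4, 1, ..., 1): the leading direction is e_1.
    scales = np.sqrt(np.concatenate(([4.0], np.ones(d - 1))))
    draw = lambda: rng.standard_normal((n, batch, d)) * scales

    # Assumed: every node starts from the same point, so the network begins in consensus.
    x0 = rng.standard_normal(d)
    x = project_sphere(np.tile(x0, (n, 1)))
    d_est = stoch_rgrad(x, draw())   # local gradient estimators
    s = d_est.copy()                 # gradient trackers
    for _ in range(iters):
        v = tangent_project(x, s)                    # search direction in the tangent space
        x_new = project_sphere(W @ x - alpha * v)    # mix with neighbors, step, project
        xi = draw()                                  # fresh O(1)-size batch per node
        # Recursive momentum: the same samples are evaluated at the new and old iterates.
        d_new = stoch_rgrad(x_new, xi) + (1.0 - tau) * (d_est - stoch_rgrad(x, xi))
        s = W @ s + d_new - d_est                    # gradient tracking update
        x, d_est = x_new, d_new
    return x
```

Calling `dprsrm_sketch()` returns one unit vector per node. In this toy setup the rows should agree closely with each other and align, up to sign, with the first coordinate axis. Each iteration draws only a small fixed batch per node and never restarts from a large batch, which matches the behavior the abstract describes.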

Paper Structure

This paper contains 24 sections, 13 theorems, 82 equations, 4 figures, 1 table, 1 algorithm.

Key Result

Lemma 3.2

Under Assumption assum-f, there exists a constant $L_g$ such that, for any $x,y\in \mathcal{M}$, the following inequality holds: [...]. Moreover, there exists a constant $L_G>0$ such that [...].

Figures (4)

  • Figure 1: Numerical results on the synthetic dataset with different network graphs.
  • Figure 2: Results on the synthetic dataset with ER $p=0.6$.
  • Figure 3: Results on the MNIST dataset with ER $p=0.3$.
  • Figure 4: Numerical results for the decentralized LRMC problem with the Ring graph.

Theorems & Definitions (24)

  • Definition 2.1: deng2023decentralized
  • Lemma 3.2: deng2023decentralized, Lemma 2
  • Definition 3.5: stochastic first-order oracle
  • Theorem 3.6: Consensus error
  • Theorem 3.7: Optimality error
  • Corollary 3.8
  • Lemma 4.1
  • Lemma 4.2
  • Proof 1: Proof of Theorem 3.6 (consensus error)
  • Lemma 4.3
  • ...and 14 more