Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold

Jun Chen, Haishan Ye, Mengmeng Wang, Tianxin Huang, Guang Dai, Ivor W. Tsang, Yong Liu

TL;DR

This work tackles decentralized optimization over the Stiefel manifold by formulating a consensus-constrained problem and introducing DRCGD, the first decentralized Riemannian conjugate gradient method. The key innovation is a retraction-free, vector-transport-free update that uses tangent-space projections to propagate search directions across a network, which lowers the per-iteration cost. The authors establish global convergence under a set of extended assumptions and show linear convergence of the consensus error, along with practical performance gains on synthetic eigenvector problems and real data (MNIST). The approach offers a scalable, efficient solution for distributed problems with orthogonality constraints, with potential impact on large-scale eigenvalue problems and related manifold-constrained optimization tasks.
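
For orientation, a consensus-constrained problem of the kind described above is commonly written as follows. This is a sketch in assumed notation: the symbols $f_i$, $x_i$, $d$, $r$ and the $1/n$ normalization are not taken from this page, and $n$ denotes the number of agents as in the figure captions.

$$
\min_{x_1,\dots,x_n} \; \frac{1}{n}\sum_{i=1}^{n} f_i(x_i)
\quad \text{s.t.} \quad x_1 = \cdots = x_n, \qquad
x_i \in \mathrm{St}(d,r) := \{\, x \in \mathbb{R}^{d\times r} : x^\top x = I_r \,\}.
$$

Each agent $i$ holds its own copy $x_i$ and local function $f_i$; the equality constraint is what the consensus steps over the network enforce.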

Abstract

The conjugate gradient method is a crucial first-order optimization method that generally converges faster than the steepest descent method, and its computational cost is much lower than that of second-order methods. However, while various types of conjugate gradient methods have been studied in Euclidean spaces and on Riemannian manifolds, they have received little study in distributed scenarios. This paper proposes a decentralized Riemannian conjugate gradient descent (DRCGD) method that aims at minimizing a global function over the Stiefel manifold. The optimization problem is distributed among a network of agents, where each agent is associated with a local function, and the communication between agents occurs over an undirected connected graph. Since the Stiefel manifold is a non-convex set, the global function is represented as a finite sum of possibly non-convex (but smooth) local functions. The proposed method is free from expensive Riemannian geometric operations such as retractions, exponential maps, and vector transports, thereby reducing the computational complexity required by each agent. To the best of our knowledge, DRCGD is the first decentralized Riemannian conjugate gradient algorithm to achieve global convergence over the Stiefel manifold.
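
The tangent-space projection that such projection-based updates rely on can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it uses the standard orthogonal projection onto the tangent space of the Stiefel manifold under the embedded Euclidean metric, and the names `sym`, `project_tangent`, `d`, and `r` are assumptions introduced here.

```python
import numpy as np


def sym(a):
    """Symmetric part of a square matrix."""
    return 0.5 * (a + a.T)


def project_tangent(x, u):
    """Orthogonally project u onto the tangent space of the Stiefel manifold at x.

    Assumes x has orthonormal columns (x.T @ x = I) and the embedded
    Euclidean metric; the formula is u - x * sym(x^T u).
    """
    return u - x @ sym(x.T @ u)


rng = np.random.default_rng(0)
d, r = 8, 3  # hypothetical sizes, chosen only for illustration

# A point on the manifold: orthonormalize a random d x r matrix.
x, _ = np.linalg.qr(rng.standard_normal((d, r)))

# An arbitrary ambient direction, e.g. a Euclidean gradient or a search
# direction received from a neighboring agent.
u = rng.standard_normal((d, r))
xi = project_tangent(x, u)

# Tangent vectors at x satisfy x^T xi + xi^T x = 0.
tangency_residual = np.linalg.norm(x.T @ xi + xi.T @ x)
print(f"tangency residual: {tangency_residual:.2e}")
```

The projection costs a few matrix products of size $d \times r$ and needs no matrix factorization, which is one reason projection-based schemes are cheaper per iteration than ones built on retractions or vector transports.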

Paper Structure (17 sections, 61 equations, 5 figures, 1 algorithm)


Figures (5)

  • Figure 1: Numerical results on synthetic data with different numbers of agents, eigengap $\Delta = 0.8$, Graph: Ring, $t=1$, $\hat{\alpha}=0.01$. y-axis: log-scale.
  • Figure 2: Numerical results on synthetic data with different numbers of consensus steps, eigengap $\Delta = 0.8$, Graph: Ring, $n=16$, $\hat{\alpha}=0.01$. y-axis: log-scale.
  • Figure 3: Numerical results on synthetic data with different network graphs, eigengap $\Delta = 0.8$, $t=10$, $n=16$, $\hat{\alpha}=0.05$. y-axis: log-scale.
  • Figure 4: Numerical results on MNIST data with single-step consensus, Graph: Ring, $n=20$.
  • Figure 5: An overview of the proofs.
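
The captions above vary the network graph and the number of consensus steps (presumably the parameter $t$). A rough sketch of why extra consensus steps help on a ring is given below. It assumes a simple symmetric, doubly stochastic mixing matrix with weight 1/3 on each agent and its two neighbors; the weights actually used in the experiments are not stated here, and `ring_mixing_matrix` is a hypothetical helper.

```python
import numpy as np


def ring_mixing_matrix(n):
    """Symmetric doubly stochastic weights on a ring of n >= 3 agents.

    Assumption for illustration: each agent averages itself and its two
    neighbors with equal weight 1/3.
    """
    w = np.zeros((n, n))
    for i in range(n):
        w[i, i] = 1.0 / 3.0
        w[i, (i - 1) % n] = 1.0 / 3.0
        w[i, (i + 1) % n] = 1.0 / 3.0
    return w


def second_largest_singular_value(w):
    """Second-largest singular value, which governs the consensus rate."""
    return np.linalg.svd(w, compute_uv=False)[1]


n = 16  # number of agents, matching the value in Figures 2 and 3
w = ring_mixing_matrix(n)

sigma_1 = second_largest_singular_value(w)
# Ten rounds of communication per iteration correspond to mixing with W^10.
sigma_10 = second_largest_singular_value(np.linalg.matrix_power(w, 10))

print(f"one consensus step:  {sigma_1:.4f}")
print(f"ten consensus steps: {sigma_10:.4f}")
```

Because the matrix is symmetric, mixing $t$ times raises the second-largest singular value to the power $t$, so disagreement between agents contracts faster per iteration at the price of more communication rounds.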