Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold
Jun Chen, Haishan Ye, Mengmeng Wang, Tianxin Huang, Guang Dai, Ivor W. Tsang, Yong Liu
TL;DR
This work tackles decentralized optimization over the Stiefel manifold by formulating a consensus-constrained problem and introducing DRCGD, the first decentralized Riemannian conjugate gradient method. The key innovation is a retraction-free, vector-transport-free update that uses tangent-space projections to propagate search directions across a network, yielding lower per-iteration cost. The authors establish global convergence under a set of extended assumptions and demonstrate linear convergence of the consensus error, along with practical performance gains on synthetic eigenvector problems and real data (MNIST). The approach offers a scalable, efficient solution for distributed problems with orthogonality constraints and has potential impact on large-scale eigenvalue problems and related manifold-constrained optimization tasks.
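To make the tangent-space projection mentioned above concrete, the following is a minimal sketch of the standard orthogonal projection onto the tangent space of the Stiefel manifold under the Euclidean metric, P_x(z) = z - x sym(x^T z). It is not the authors' update rule: the function names, the dimensions d and r, and the random test data are assumptions chosen for illustration only.

```python
import numpy as np


def sym(a):
    """Symmetric part of a square matrix."""
    return 0.5 * (a + a.T)


def project_tangent(x, z):
    """Project z onto the tangent space of the Stiefel manifold at x.

    Assumes x has orthonormal columns (x.T @ x = I) and uses the
    Euclidean (embedded) metric.
    """
    return z - x @ sym(x.T @ z)


# Illustrative sizes and data (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
d, r = 6, 2
x, _ = np.linalg.qr(rng.standard_normal((d, r)))  # a point on St(d, r)
z = rng.standard_normal((d, r))                   # an arbitrary ambient direction

v = project_tangent(x, z)

# A tangent vector v at x satisfies x.T @ v + v.T @ x = 0.
tangency_residual = np.linalg.norm(x.T @ v + v.T @ x)
# Projecting a vector that is already tangent should leave it unchanged.
idempotence_residual = np.linalg.norm(project_tangent(x, v) - v)
```

A projection of this kind costs only a few matrix products, which is one plausible reason it is cheaper than a vector transport for carrying a direction computed at one point over to the tangent space at another.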
Abstract
The conjugate gradient method is a crucial first-order optimization method that generally converges faster than the steepest descent method, and its computational cost is much lower than that of second-order methods. However, while various types of conjugate gradient methods have been studied in Euclidean spaces and on Riemannian manifolds, they have received little study in distributed scenarios. This paper proposes a decentralized Riemannian conjugate gradient descent (DRCGD) method that aims at minimizing a global function over the Stiefel manifold. The optimization problem is distributed among a network of agents, where each agent is associated with a local function, and the communication between agents occurs over an undirected connected graph. Since the Stiefel manifold is a non-convex set, the global function is represented as a finite sum of possibly non-convex (but smooth) local functions. The proposed method is free from expensive Riemannian geometric operations such as retractions, exponential maps, and vector transports, thereby reducing the computational complexity required by each agent. To the best of our knowledge, DRCGD is the first decentralized Riemannian conjugate gradient algorithm to achieve global convergence over the Stiefel manifold.
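The setting described above can be written as a consensus-constrained problem. The formulation below is a sketch in commonly used notation, not quoted from the paper: the symbols n (number of agents), d and r (matrix dimensions), f_i (local functions), and the 1/n normalization are assumptions.

```latex
\min_{x_1, \dots, x_n} \; \frac{1}{n} \sum_{i=1}^{n} f_i(x_i)
\quad \text{s.t.} \quad x_1 = x_2 = \dots = x_n, \qquad
x_i \in \mathrm{St}(d, r) := \{\, x \in \mathbb{R}^{d \times r} : x^\top x = I_r \,\}, \quad i = 1, \dots, n.
```

Here each agent i holds its own copy x_i and evaluates only its local function f_i, while the equality constraint encodes agreement across the network.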
