A Variance-Reduced Stochastic Gradient Tracking Algorithm for Decentralized Optimization with Orthogonality Constraints
Lei Wang, Xin Liu
TL;DR
The paper tackles decentralized optimization over the Stiefel manifold, that is, under orthogonality constraints, and proposes VRSGT, an algorithm that reduces sampling and communication costs simultaneously. It introduces augmented Lagrangian estimation and gradient approximation to handle the constraints and compute descent directions efficiently, achieving an $O(1/k)$ convergence rate to a stationary point. The authors provide a convergence analysis under mild local smoothness assumptions and report strong empirical performance on decentralized PCA and DPCP tasks, including autonomous driving scenarios. The work offers practical, scalable tools for distributed learning with nonconvex orthogonality constraints and points to possible applications in deep learning and real-time sensing systems.
Abstract
Decentralized optimization with orthogonality constraints arises widely in scientific computing and data science. Since the orthogonality constraints are nonconvex, it is quite challenging to design efficient algorithms. Existing approaches leverage geometric tools from Riemannian optimization to solve this problem, at the cost of high sample and communication complexities. To relieve this difficulty, we propose a variance-reduced stochastic gradient tracking (VRSGT) algorithm, built on two novel techniques that waive the orthogonality constraints, with a convergence rate of $O(1 / k)$ to a stationary point. To the best of our knowledge, VRSGT is the first algorithm for decentralized optimization with orthogonality constraints that reduces both sampling and communication complexities simultaneously. In numerical experiments, VRSGT shows promising performance in a real-world autonomous driving application.
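For concreteness, the problem class described above is commonly written in the following form. The notation is assumed here for illustration ($d$ agents, a local objective $f_i$ held by agent $i$, and a shared variable $X$ of size $n \times p$) and may differ from the symbols used in the paper itself.

$$
\min_{X \in \mathbb{R}^{n \times p}} \; \frac{1}{d} \sum_{i=1}^{d} f_i(X)
\quad \text{s.t.} \quad X^\top X = I_p .
$$

The feasible set $\{X \in \mathbb{R}^{n \times p} : X^\top X = I_p\}$ is the Stiefel manifold, which is nonconvex. As one illustrative instance, a decentralized PCA objective can take the form $f_i(X) = -\frac{1}{2}\,\mathrm{tr}(X^\top A_i A_i^\top X)$, where $A_i$ is a hypothetical local data matrix stored by agent $i$; the sign and scaling conventions are assumptions.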
