A Communication-Efficient Stochastic Gradient Descent Algorithm for Distributed Nonconvex Optimization
Antai Xie, Xinlei Yi, Xiaofan Wang, Ming Cao, Xiaoqiang Ren
TL;DR
This paper proposes a distributed stochastic gradient descent algorithm, suitable for a general class of compressors, and shows that it achieves the linear speedup convergence rate of $\mathcal{O}(1/\sqrt{nT})$ for smooth nonconvex functions.
Abstract
This paper studies distributed nonconvex optimization problems with stochastic gradients for a multi-agent system, in which each agent aims to minimize the sum of all agents' cost functions by exchanging compressed information with its neighbors. We propose a distributed stochastic gradient descent (SGD) algorithm suitable for a general class of compressors. We show that the proposed algorithm achieves the linear speedup convergence rate $\mathcal{O}(1/\sqrt{nT})$ for smooth nonconvex functions, where $T$ and $n$ are the number of iterations and agents, respectively. If the global cost function additionally satisfies the Polyak--Łojasiewicz condition, the proposed algorithm converges linearly to a neighborhood of the global optimum, regardless of whether the stochastic gradient is unbiased or not. Numerical experiments are carried out to verify the efficiency of our algorithm.
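For concreteness, the setting summarized above can be sketched in standard notation. The symbols $f_i$, $f^\star$, $\nu$, $d$, and $\bar{x}_t$ are assumptions introduced here for illustration; the paper's own definitions, constants, and stationarity measure may differ. Minimizing the average of the local costs is equivalent to minimizing their sum, so the problem can be written as

$$
\min_{x \in \mathbb{R}^d} \; f(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x),
$$

where $f_i$ is the smooth, possibly nonconvex cost function held by agent $i$. The Polyak--Łojasiewicz condition on the global cost $f$ is usually stated as: there exists $\nu > 0$ such that

$$
\frac{1}{2} \, \|\nabla f(x)\|^2 \;\ge\; \nu \, \bigl( f(x) - f^\star \bigr) \quad \text{for all } x \in \mathbb{R}^d,
$$

with $f^\star = \min_{x} f(x)$. A linear speedup result of order $\mathcal{O}(1/\sqrt{nT})$ is commonly expressed as a bound on the averaged expected squared gradient norm, for example

$$
\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E} \bigl[ \|\nabla f(\bar{x}_t)\|^2 \bigr] = \mathcal{O}\!\left( \frac{1}{\sqrt{nT}} \right),
$$

where $\bar{x}_t$ denotes the average of the agents' local iterates at iteration $t$. Under a bound of this form, reaching a given accuracy $\epsilon$ takes on the order of $1/(n\epsilon^2)$ iterations, so the iteration count shrinks in proportion to the number of agents $n$, which is what "linear speedup" refers to.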
