On graphs with finite-time consensus and their use in gradient tracking
Edward Duc Hien Nguyen, Xin Jiang, Bicheng Ying, César A. Uribe
TL;DR
This work addresses decentralized optimization of $f(x)=\frac{1}{n}\sum_{i=1}^n f_i(x)$ under deterministic sequences of graphs that satisfy finite-time consensus, enabling exact averaging after $\tau$ steps. It introduces Gradient Tracking for Finite-Time Consensus Topologies (GT-FT), which restricts gradient-tracking updates to topology sequences with finite-time averaging, and provides nonconvex convergence guarantees under stochastic gradients with a stepsize $\alpha \in (0, 1/(4\sqrt{6}\tau^2 L)]$. The authors derive explicit weight-matrix representations for one-peer exponential graphs and for $p$-peer hyper-cuboids, show finite-time consensus for these sequences, and establish a connection to de Bruijn graphs, broadening the class of sparse, scalable topologies available for decentralized optimization. Numerical experiments demonstrate that GT-FT attains the same iteration complexity as GT with static topologies while offering substantially lower communication costs due to sparsity, illustrating practical benefits for large-scale and resource-constrained networks.
Abstract
This paper studies sequences of graphs satisfying the finite-time consensus property (i.e., iterating through such a finite sequence is equivalent to performing global or exact averaging) and their use in Gradient Tracking. We provide an explicit weight matrix representation of the studied sequences and prove their finite-time consensus property. Moreover, we incorporate the studied finite-time consensus topologies into Gradient Tracking and present a new algorithmic scheme called Gradient Tracking for Finite-Time Consensus Topologies (GT-FT). We analyze the new scheme for nonconvex problems with stochastic gradient estimates. Our analysis shows that the convergence rate of GT-FT does not depend on the heterogeneity of the agents' functions or the connectivity of any individual graph in the topology sequence. Furthermore, owing to the sparsity of the graphs, GT-FT requires lower communication costs than Gradient Tracking using the static counterpart of the topology sequence.
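The finite-time consensus property can be checked numerically on a small case. The sketch below is illustrative only and is not taken from the paper: it assumes $n = 2^\tau$ agents on a ring, and it assumes the common one-peer exponential construction in which, at step $k$, agent $i$ averages equally with the agent $2^k$ positions away, i.e. $W^{(k)} = \frac{1}{2}(I + P^{2^k})$ with $P$ the cyclic shift. The function names are hypothetical, and exact rational arithmetic is used so that the comparison with $\frac{1}{n}\mathbf{1}\mathbf{1}^\top$ is exact rather than up to rounding.

```python
from fractions import Fraction


def one_peer_exponential_weights(n, k):
    """Assumed weight matrix W^(k) = (I + P^(2^k)) / 2 as nested lists."""
    half = Fraction(1, 2)
    W = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        W[i][i] += half                  # keep half of the agent's own value
        W[i][(i + 2 ** k) % n] += half   # take half from the single peer
    return W


def matmul(A, B):
    """Plain matrix product for square nested-list matrices."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def product_of_sequence(n, tau):
    """Return W^(tau-1) ... W^(1) W^(0), the effect of one pass over the sequence."""
    prod = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(tau):
        prod = matmul(one_peer_exponential_weights(n, k), prod)
    return prod


def averaging_matrix(n):
    """Exact averaging matrix (1/n) * 1 1^T."""
    return [[Fraction(1, n)] * n for _ in range(n)]


n, tau = 8, 3
is_exact = product_of_sequence(n, tau) == averaging_matrix(n)

# Applying the same three sparse steps to local values drives every agent to the mean.
values = [Fraction(v) for v in [3, 1, 4, 1, 5, 9, 2, 6]]
for k in range(tau):
    W = one_peer_exponential_weights(n, k)
    values = [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]
all_at_mean = all(v == Fraction(31, 8) for v in values)
```

Under these assumptions each agent communicates with only one peer per step, yet after $\tau = \log_2 n$ steps every agent holds the exact average. The same three hops applied with $n = 6$ do not produce the averaging matrix, which is consistent with this particular construction relying on $n$ being a power of two.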
