Optimal and Efficient Algorithms for Decentralized Online Convex Optimization

Yuanyu Wan, Tong Wei, Bo Xue, Mingli Song, Lijun Zhang

TL;DR

The paper tackles decentralized online convex optimization on a network, addressing gaps between existing upper and lower bounds by introducing an accelerated gossip-based algorithm (AD-FTGL) with a blocking update that yields near-optimal regret: $\tilde{O}(n\rho^{-1/4}\sqrt{T})$ for convex and $\tilde{O}(n\rho^{-1/2}\log T)$ for strongly convex objectives. It also provides matching topology-aware lower bounds and extends the framework with a projection-free variant achieving favorable regret-communication trade-offs, including $O(nT^{3/4})$ for convex and $O(nT^{2/3}(\log T)^{1/3})$ for strongly convex cases, with nearly optimal communication rounds. The analysis hinges on improved consensus via Acc_Gossip, a blocking update mechanism, and refined spectral-gap arguments, including a tighter $O(\rho^{-1}\log n)$ bound for gossip errors. Collectively, the results demonstrate that the proposed algorithms are nearly optimal in $T$, $n$, and $\rho$ and offer practical projection-free alternatives for complex constraint sets. The study advances understanding of topology- and communication-dependent limits in D-OCO and offers tools for efficient distributed online learning under realistic communication constraints.
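To illustrate why accelerated gossip reaches consensus faster than plain averaging, here is a minimal sketch, not the paper's Acc_Gossip procedure, which this summary does not reproduce. It assumes the standard two-term recursion $X^{k+1}=(1+\theta)PX^{k}-\theta X^{k-1}$ with $X^{-1}=X^{0}$, rows of $X$ indexed by learners, and the mixing weight $\theta=(1+\sqrt{1-\sigma_2^2(P)})^{-1}$ quoted in Lemma 1 on this page. The ring topology, the uniform $1/3$ weights, and all function names are hypothetical choices made only for illustration.

```python
import numpy as np


def ring_gossip_matrix(n):
    """Symmetric, doubly stochastic weights for a ring of n >= 3 learners.

    Illustrative topology only: each learner averages itself and its
    two neighbours with weight 1/3.
    """
    P = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            P[i, j % n] = 1.0 / 3.0
    return P


def plain_gossip(P, X0, L):
    """Run L rounds of standard gossip: X <- P X."""
    X = X0.astype(float)
    for _ in range(L):
        X = P @ X
    return X


def accelerated_gossip(P, X0, L):
    """Run L rounds of the assumed accelerated recursion
    X^{k+1} = (1 + theta) P X^k - theta X^{k-1}, starting from X^{-1} = X^0.
    """
    # Second largest singular value of P (singular values come sorted).
    sigma2 = np.linalg.svd(P, compute_uv=False)[1]
    # Mixing weight as quoted in Lemma 1 of this page.
    theta = 1.0 / (1.0 + np.sqrt(1.0 - sigma2 ** 2))
    X_prev = X0.astype(float)
    X = X0.astype(float)
    for _ in range(L):
        X_prev, X = X, (1.0 + theta) * (P @ X) - theta * X_prev
    return X


def consensus_error(X):
    """Frobenius distance between X and its row-wise average."""
    return np.linalg.norm(X - X.mean(axis=0, keepdims=True))


if __name__ == "__main__":
    n, L = 20, 50
    P = ring_gossip_matrix(n)
    X0 = np.arange(n, dtype=float).reshape(n, 1)  # one scalar per learner
    print("plain gossip error      :", consensus_error(plain_gossip(P, X0, L)))
    print("accelerated gossip error:", consensus_error(accelerated_gossip(P, X0, L)))
```

Both routines preserve the network average because $P$ is doubly stochastic. The accelerated version only adds a momentum term that reuses the previous iterate, so each round costs the same single exchange with neighbours.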

Abstract

We investigate decentralized online convex optimization (D-OCO), in which a set of local learners are required to minimize a sequence of global loss functions using only local computations and communications. Previous studies have established $O(n^{5/4}\rho^{-1/2}\sqrt{T})$ and $O(n^{3/2}\rho^{-1}\log T)$ regret bounds for convex and strongly convex functions respectively, where $n$ is the number of local learners, $\rho<1$ is the spectral gap of the communication matrix, and $T$ is the time horizon. However, there exist large gaps from the existing lower bounds, i.e., $\Omega(n\sqrt{T})$ for convex functions and $\Omega(n)$ for strongly convex functions. To fill these gaps, in this paper, we first develop a novel D-OCO algorithm that can respectively reduce the regret bounds for convex and strongly convex functions to $\tilde{O}(n\rho^{-1/4}\sqrt{T})$ and $\tilde{O}(n\rho^{-1/2}\log T)$. The primary technique is to design an online accelerated gossip strategy that enjoys a faster average consensus among local learners. Furthermore, by carefully exploiting spectral properties of a specific network topology, we enhance the lower bounds for convex and strongly convex functions to $\Omega(n\rho^{-1/4}\sqrt{T})$ and $\Omega(n\rho^{-1/2}\log T)$, respectively. These results suggest that the regret of our algorithm is nearly optimal in terms of $T$, $n$, and $\rho$ for both convex and strongly convex functions. Finally, we propose a projection-free variant of our algorithm to efficiently handle practical applications with complex constraints. Our analysis reveals that the projection-free variant can achieve $O(nT^{3/4})$ and $O(nT^{2/3}(\log T)^{1/3})$ regret bounds for convex and strongly convex functions with nearly optimal $\tilde{O}(\rho^{-1/2}\sqrt{T})$ and $\tilde{O}(\rho^{-1/2}T^{1/3}(\log T)^{2/3})$ communication rounds, respectively.
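To make the size of the improvement concrete, one can divide the earlier upper bounds quoted in the abstract by the new ones. This is a back-of-the-envelope comparison that ignores the polylogarithmic factors hidden by $\tilde{O}$, not a statement taken from the paper:

$$
\frac{n^{5/4}\rho^{-1/2}\sqrt{T}}{n\rho^{-1/4}\sqrt{T}} = n^{1/4}\rho^{-1/4},
\qquad
\frac{n^{3/2}\rho^{-1}\log T}{n\rho^{-1/2}\log T} = n^{1/2}\rho^{-1/2}.
$$

Since $\rho<1$, both ratios exceed 1 and grow as the network gets larger or its spectral gap shrinks. For a hypothetical network with $n=16$ and $\rho=1/16$, the ratios are $2\cdot 2=4$ in the convex case and $4\cdot 4=16$ in the strongly convex case.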

Paper Structure (29 sections, 21 theorems, 162 equations, 2 tables, 4 algorithms)

Key Result

Lemma 1

(Proposition 1 in Ye2020) Under Assumption assum5, for $L\geq 1$, the iterations of (eq-fastMix-pre) with $\theta=(1+\sqrt{1-\sigma_2^2(P)})^{-1}$ ensure that

Theorems & Definitions (24)

  • Lemma 1
  • Lemma 2
  • Theorem 1
  • Corollary 1
  • Corollary 2
  • Remark 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Remark 2
  • ...and 14 more