Local adapt-then-combine algorithms for distributed nonsmooth optimization: Achieving provable communication acceleration

Luyao Guo, Xinli Shi, Wenying Xu, Jinde Cao

TL;DR

This work tackles distributed composite optimization over networks by introducing FlexATC, a unified Adapt-Then-Combine (ATC) framework that uses probabilistic local updates to reduce communication. Its stepsizes are independent of the network topology and the number of local updates. It attains sublinear convergence in the convex setting and linear convergence in the strongly convex setting, with a rate factor $\zeta$ decoupled from the objective functions and the topology. It also shows that communication can be skipped with probability $p$ without deteriorating the linear rate, providing provable communication acceleration for ATC-based algorithms. The theory places numerous ATC schemes (e.g., NIDS/ED/D2, MG-ED, ATC-GT, MG-SONATA) under a single umbrella and is validated by experiments on the ijcnn1 dataset, which confirm practical efficiency gains in decentralized settings.

Abstract

This paper is concerned with the distributed composite optimization problem over networks, where agents aim to minimize a sum of local smooth components and a common nonsmooth term. Leveraging the probabilistic local updates mechanism, we propose a communication-efficient Adapt-Then-Combine (ATC) framework, FlexATC, unifying numerous ATC-based distributed algorithms. Under stepsizes independent of the network topology and the number of local updates, we establish sublinear and linear convergence rates for FlexATC in convex and strongly convex settings, respectively. Remarkably, in the strongly convex setting, the linear rate is decoupled from the objective functions and network topology, and FlexATC permits communication to be skipped in most iterations without any deterioration of the linear rate. In addition, the proposed unified theory demonstrates for the first time that local updates provably lead to communication acceleration for ATC-based distributed algorithms. Numerical experiments further validate the efficacy of the proposed framework and corroborate the theoretical results.
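To make the idea of probabilistic local updates concrete, the following is a minimal sketch of a plain adapt-then-combine recursion in which the combine (communication) step is executed only with probability `p`. It is not FlexATC itself: the correction variables and the nonsmooth term described in the abstract are omitted, and the function name `local_atc`, the ring mixing matrix, the least-squares losses, and all parameter values are assumptions chosen for illustration. The toy data are deliberately homogeneous (every agent shares the same minimizer), so this uncorrected recursion converges exactly; heterogeneous data would require the corrections that the paper's framework provides.

```python
import numpy as np

def local_atc(A, b, W, step, p, iters, seed=0):
    """Illustrative ATC loop with random communication skipping (not FlexATC)."""
    rng = np.random.default_rng(seed)
    n, _, d = A.shape
    x = np.zeros((n, d))          # row i holds agent i's local iterate
    comm_rounds = 0
    for _ in range(iters):
        # Adapt: each agent takes a gradient step on 0.5 * ||A_i x_i - b_i||^2.
        residual = np.einsum("imd,id->im", A, x) - b
        grad = np.einsum("imd,im->id", A, residual)
        x = x - step * grad
        # Combine: average with neighbors only with probability p.
        if rng.random() < p:
            x = W @ x
            comm_rounds += 1
    return x, comm_rounds

# Hypothetical toy problem: 5 agents on a ring, shared minimizer x_true.
data_rng = np.random.default_rng(1)
n, m, d = 5, 20, 3
A = data_rng.standard_normal((n, m, d))
x_true = data_rng.standard_normal(d)
b = A @ x_true                    # shape (n, m), noise-free

# Symmetric doubly stochastic mixing matrix for a ring graph.
I = np.eye(n)
W = (I + np.roll(I, 1, axis=1) + np.roll(I, -1, axis=1)) / 3.0

# Stepsize 1/L, with L the largest local smoothness constant.
L = np.max(np.linalg.norm(A, ord=2, axis=(1, 2))) ** 2
iters = 2000
x, comm_rounds = local_atc(A, b, W, step=1.0 / L, p=0.2, iters=iters)

rel_err = np.linalg.norm(x - x_true) / (np.sqrt(n) * np.linalg.norm(x_true))
print(f"communication rounds: {comm_rounds} of {iters} iterations")
print(f"relative error: {rel_err:.2e}")
```

Lowering `p` reduces the number of communication rounds for a fixed iteration budget; whether the convergence rate is preserved in general is precisely what the paper's analysis addresses, and this sketch makes no such claim.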

Paper Structure (11 sections, 4 theorems, 50 equations, 1 figure, 3 tables, 1 algorithm)


Key Result

Lemma 1

Under Assumptions ASS1, MixingMatrix-1, and MixingMatrix-2, if the point $(\bm{x}^{\star},\bm{w}^{\star},\bm{u}^\star)$ satisfies that […], then $\bm{x}^{\star}=\bm{1}_n\otimes x^\star$ with $x^{\star}\in \mathbb{R}^d$ solving problem EQ:Problem1.

Figures (1)

  • Figure 1: Numerical results on the ijcnn1 dataset illustrating communication acceleration, with the relative error $\|\bm{x}^k-\bm{x}^\star\|/\|\bm{x}^\star\|$ plotted against the number of iterations and the number of communication rounds.

Theorems & Definitions (4)

  • Lemma 1
  • Lemma 2
  • Theorem 1
  • Theorem 2