Lower Bounds and Accelerated Algorithms in Distributed Stochastic Optimization with Communication Compression

Yutong He, Xinmeng Huang, Yiming Chen, Wotao Yin, Kun Yuan

TL;DR

This work establishes fundamental lower bounds for distributed stochastic optimization under two broad compression families, unbiased and contractive, across strongly-convex, generally-convex, and non-convex settings. It then introduces NEOLITHIC, an accelerated algorithm that nearly matches these bounds (up to logarithmic factors) by combining Nesterov acceleration, gradient accumulation, and a novel multi-step compression (MSC) module. Theoretical results are complemented by experiments showing NEOLITHIC’s competitive convergence and robustness to data heterogeneity and noise. The findings clarify the limitations of existing compressors and motivate exploration of new compressor properties to surpass current limits, with practical implications for scalable distributed learning systems.

Abstract

Communication compression is an essential strategy for alleviating communication overhead by reducing the volume of information exchanged between computing nodes in large-scale distributed stochastic optimization. Although numerous algorithms with convergence guarantees have been obtained, the optimal performance limit under communication compression remains unclear. In this paper, we investigate the performance limit of distributed stochastic optimization algorithms employing communication compression. We focus on two main types of compressors, unbiased and contractive, and address the best-possible convergence rates one can obtain with these compressors. We establish the lower bounds for the convergence rates of distributed stochastic optimization in six different settings, combining strongly-convex, generally-convex, or non-convex functions with unbiased or contractive compressor types. To bridge the gap between lower bounds and existing algorithms' rates, we propose NEOLITHIC, a nearly optimal algorithm with compression that achieves the established lower bounds up to logarithmic factors under mild conditions. Extensive experimental results support our theoretical findings. This work provides insights into the theoretical limitations of existing compressors and motivates further research into fundamentally new compressor properties.
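To make the two compressor families concrete, here is a minimal sketch in Python. It is not taken from the paper: it uses rand-k and top-k sparsification as stand-ins and assumes the standard definitions, namely that an unbiased compressor satisfies $\mathbb{E}[C(x)]=x$ and $\mathbb{E}\|C(x)-x\|^2\le\omega\|x\|^2$, while a contractive compressor satisfies $\mathbb{E}\|C(x)-x\|^2\le(1-\delta)\|x\|^2$. The function names, the test vector, and the symbol $\delta$ are illustrative assumptions.

```python
import random


def rand_k(x, k, rng):
    """Unbiased rand-k: keep k random coordinates, scaled by d/k so E[C(x)] = x."""
    d = len(x)
    keep = set(rng.sample(range(d), k))
    return [xi * d / k if i in keep else 0.0 for i, xi in enumerate(x)]


def top_k(x, k):
    """Contractive top-k: keep the k largest-magnitude coordinates, unscaled."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [xi if i in keep else 0.0 for i, xi in enumerate(x)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


rng = random.Random(0)          # fixed seed, chosen arbitrarily
x = [3.0, -1.0, 0.5, 2.0]       # hypothetical vector to compress
d, k = len(x), 2
norm_sq = sq_dist(x, [0.0] * d)

# Assumed constants for these two compressors (standard, not from the excerpt).
omega = d / k - 1               # rand-k variance parameter
delta = k / d                   # top-k contraction parameter

# top-k is deterministic, so the contraction bound can be checked directly.
top_err = sq_dist(top_k(x, k), x)

# rand-k is random, so estimate its mean and mean squared error by sampling.
trials = 20000
mean = [0.0] * d
mse = 0.0
for _ in range(trials):
    c = rand_k(x, k, rng)
    mean = [m + ci / trials for m, ci in zip(mean, c)]
    mse += sq_dist(c, x) / trials
```

In this sketch top-k is biased but never increases the error beyond $(1-\delta)\|x\|^2$, whereas rand-k is correct on average at the cost of a variance that scales with $\omega$.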

Paper Structure

This paper contains 44 sections, 20 theorems, 147 equations, 3 figures, 2 tables, 3 algorithms.

Key Result

Theorem 1

For any $L\ge\mu\ge0$, $n\ge2$, $\omega\ge0$, and $\sigma\ge0$, the following results hold (the proof is in the appendix on lower bounds).

Figures (3)

  • Figure 1: Convergence results of various algorithms on a distributed least-squares problem. The $y$-axis represents $f-f^\star$ (dB) and the $x$-axis indicates the total number of communication rounds (in thousands). All curves are averaged over 20 trials.
  • Figure 2: Convergence results of various algorithms on a distributed logistic regression problem. The $y$-axis represents $f-f^\star$ (dB) and the $x$-axis indicates the total number of communication rounds (in thousands). All curves are averaged over 20 trials.
  • Figure 3: Best precision over $10{,}000$ total communication rounds for different MSC rounds and compressors, under varying data heterogeneity and gradient noise scales. All curves are averaged over 20 trials.

Theorems & Definitions (31)

  • Definition 1: Algorithm class
  • Theorem 1: Unbiased compressor
  • Lemma 1: Compressor relation
  • Theorem 2
  • Lemma 2: MSC property
  • Remark 1: Extension to bidirectional compression
  • Theorem 3: Strongly-convex scenario
  • Remark 2: Total number of iterations
  • Remark 3: Tightness of the lower bounds
  • Remark 4: Performance of NEOLITHIC
  • ...and 21 more