Tight analyses of first-order methods with error feedback
Daniel Berg Thomsen, Adrien Taylor, Aymeric Dieuleveut
TL;DR
This work addresses the communication bottleneck in distributed optimization by analyzing first-order methods that combine gradient compression with error feedback. Using the performance estimation framework, it derives tight Lyapunov-based convergence guarantees for $\mathrm{EF}$ and $\mathrm{EF}^{21}$ in the single-agent setting with smooth, μ-strongly convex objectives, and compares them apples-to-apples against compressed gradient descent (CGD). The authors show that $\mathrm{EF}$ and $\mathrm{EF}^{21}$ have identical optimal contraction rates under deterministic compression, while CGD achieves faster rates across the tested regimes, and they provide analytically optimal step sizes. The methodology combines worst-case analysis based on semidefinite programming (SDP) with symbolic regression and computer algebra systems (CAS) to produce simple, provably tight Lyapunov functions, offering a transferable toolkit for assessing compressed optimization methods.
Abstract
Communication between agents often constitutes a major computational bottleneck in distributed learning. One of the most common mitigation strategies is to compress the information exchanged, thereby reducing communication overhead. To counteract the degradation in convergence associated with compressed communication, error feedback schemes -- most notably $\mathrm{EF}$ and $\mathrm{EF}^{21}$ -- were introduced. In this work, we provide a tight analysis of both of these methods. Specifically, we find the Lyapunov function that yields the best possible convergence rate for each method -- with matching lower bounds. This principled approach yields sharp performance guarantees and enables a rigorous, apples-to-apples comparison between $\mathrm{EF}$, $\mathrm{EF}^{21}$, and compressed gradient descent. Our analysis is carried out in the simplified single-agent setting, which allows for clean theoretical insights and fair comparison of the underlying mechanisms.
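To make the three methods being compared concrete, the following is a minimal single-agent sketch of one iteration of CGD, $\mathrm{EF}$, and $\mathrm{EF}^{21}$ as they are commonly stated in the literature. It is an illustration under stated assumptions, not the authors' code: the top-k compressor, the diagonal quadratic objective, the step size `eta`, and all function names are hypothetical choices made here, and the update rules may differ in notation from those used in the paper.

```python
import numpy as np


def top_k(v, k):
    """Deterministic compressor: keep the k largest-magnitude entries of v."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out


def cgd_step(x, grad, eta, compress):
    """Compressed gradient descent: step along the compressed gradient."""
    return x - eta * compress(grad(x))


def ef_step(x, e, grad, eta, compress):
    """EF: compress the step plus accumulated error, carry the residual forward."""
    target = e + eta * grad(x)
    p = compress(target)
    return x - p, target - p


def ef21_step(x, g, grad, eta, compress):
    """EF21: step with the gradient estimate g, then refresh g with a
    compressed correction toward the new gradient."""
    x_new = x - eta * g
    g_new = g + compress(grad(x_new) - g)
    return x_new, g_new


# Toy problem (assumption): f(x) = 0.5 * x^T A x with mu = 1 and L = 10.
A = np.diag([1.0, 2.0, 5.0, 10.0])


def grad(x):
    return A @ x


def compress(v):
    return top_k(v, 2)


x0 = np.ones(4)
eta = 0.05  # illustrative value only, not an optimal step size from the paper

x_cgd = x0.copy()
x_ef, e = x0.copy(), np.zeros(4)
x_ef21, g = x0.copy(), compress(grad(x0))
for _ in range(50):
    x_cgd = cgd_step(x_cgd, grad, eta, compress)
    x_ef, e = ef_step(x_ef, e, grad, eta, compress)
    x_ef21, g = ef21_step(x_ef21, g, grad, eta, compress)

print("distance to optimum (CGD, EF, EF21):",
      np.linalg.norm(x_cgd), np.linalg.norm(x_ef), np.linalg.norm(x_ef21))
```

The sketch shows the structural difference between the mechanisms: $\mathrm{EF}$ stores the compression residual `e` and adds it back before the next compression, whereas $\mathrm{EF}^{21}$ maintains a running gradient estimate `g` and compresses only the difference between the fresh gradient and that estimate. Nothing here reproduces the tight rates or step sizes derived in the paper.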
