Refined Analysis of Federated Averaging and Federated Richardson-Romberg
Paul Mangold, Alain Durmus, Aymeric Dieuleveut, Sergey Samsonov, Eric Moulines
TL;DR
The paper analyzes FedAvg with a constant step size and local updates, proving that its global iterates converge to a stationary distribution. This yields precise first-order characterizations of the bias and variance of that distribution relative to the problem's solution. The bias separates into two components, one due to stochastic gradient noise and one due to client heterogeneity, and the analysis covers both deterministic and stochastic gradient settings. A new method based on Richardson-Romberg extrapolation reduces both sources of bias without extra memory, which in turn reduces communication requirements. Numerical experiments on logistic regression complement the theory and illustrate the bias reduction in homogeneous and heterogeneous data regimes. The stationary-distribution perspective offers a principled way to design federated optimization algorithms with controlled bias and improved efficiency.
Abstract
In this paper, we present a novel analysis of FedAvg with constant step size, relying on the Markov property of the underlying process. We demonstrate that the global iterates of the algorithm converge to a stationary distribution and analyze its resulting bias and variance relative to the problem's solution. We provide a first-order bias expansion in both homogeneous and heterogeneous settings. Interestingly, this bias decomposes into two distinct components: one that depends solely on stochastic gradient noise and another on client heterogeneity. Finally, we introduce a new algorithm based on the Richardson-Romberg extrapolation technique to mitigate this bias.
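To make the extrapolation idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a toy deterministic setting with two hypothetical scalar quadratic clients, f_c(t) = 0.5 * a_c * (t - b_c)^2, and it assumes the classical Richardson-Romberg combination of two runs with step sizes gamma and 2 * gamma. The function name `fedavg_limit` and all numerical values are illustrative choices. With exact gradients there is no stochastic-noise term, so only the heterogeneity part of the bias appears here.

```python
def fedavg_limit(a, b, step, local_steps, rounds=2000, theta0=0.0):
    """Run deterministic FedAvg on clients f_c(t) = 0.5 * a_c * (t - b_c)**2.

    Each round, every client starts from the global model, takes
    `local_steps` exact gradient steps, and the server averages the results.
    Returns the global model after `rounds` rounds (numerically its limit).
    """
    theta = theta0
    for _ in range(rounds):
        local_models = []
        for a_c, b_c in zip(a, b):
            t = theta
            for _ in range(local_steps):
                t -= step * a_c * (t - b_c)  # gradient of f_c at t
            local_models.append(t)
        theta = sum(local_models) / len(local_models)
    return theta


# Hypothetical heterogeneous clients: different curvatures and minimizers.
a = [1.0, 3.0]
b = [-1.0, 2.0]

# Minimizer of the average objective (closed form for quadratics).
theta_star = sum(a_c * b_c for a_c, b_c in zip(a, b)) / sum(a)

gamma, H = 0.01, 5  # illustrative step size and number of local steps
theta_gamma = fedavg_limit(a, b, gamma, H)
theta_2gamma = fedavg_limit(a, b, 2 * gamma, H)

# Assumed Richardson-Romberg combination: cancels the bias term linear in gamma.
theta_rr = 2 * theta_gamma - theta_2gamma

print("FedAvg error (gamma):   ", abs(theta_gamma - theta_star))
print("FedAvg error (2*gamma): ", abs(theta_2gamma - theta_star))
print("Extrapolated error:     ", abs(theta_rr - theta_star))
```

In this toy problem the FedAvg limit is biased because the clients disagree and take several local steps, and that bias is approximately linear in the step size, which is why combining the two runs removes most of it. The stochastic setting treated in the paper would additionally require averaging over the stationary distribution, which this sketch does not attempt.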
