Towards Understanding Generalization and Stability Gaps between Centralized and Decentralized Federated Learning
Yan Sun, Li Shen, Dacheng Tao
TL;DR
The paper analyzes generalization and stability gaps between Centralized Federated Learning (CFL) and Decentralized Federated Learning (DFL) on smooth non-convex objectives. It develops a uniform stability framework that does not assume bounded full gradients and derives explicit excess-risk bounds for FedAvg (CFL) and D-FedAvg (DFL), showing CFL generalizes no worse than DFL and that partial participation can be optimal for CFL. It further identifies a minimal topology requirement for DFL to avoid performance collapse as the client count grows and characterizes how topology and participation shape generalization. Extensive experiments on CIFAR-10 with Dirichlet-heterogeneous data validate the theory, demonstrating CFL’s superior test accuracy at larger scales and clarifying scenarios where DFL may be preferable due to communication constraints.
Abstract
Centralized and decentralized approaches are the two mainstream frameworks in federated learning (FL), and both have shown great value in practical applications. However, existing studies offer neither sufficient evidence nor clear guidance on which of the two performs better. Although decentralized methods have been proven to achieve convergence comparable to that of centralized methods with less communication, their test performance always falls short of expectations in empirical studies. To compare the two frameworks comprehensively and fairly, in this paper we study their stability and generalization. Specifically, we prove that on general smooth non-convex objectives, 1) centralized FL (CFL) always generalizes better than decentralized FL (DFL); 2) CFL achieves its best performance with partial participation rather than full participation; and 3) the topology in DFL must satisfy a necessary requirement to avoid performance collapse as the training scale increases. We also conduct extensive experiments on several common FL setups, which confirm that our theoretical analysis is consistent with the observed behavior and holds in several general and practical scenarios.
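To make the structural difference between the two frameworks concrete, the following is a minimal sketch of one aggregation step in each setting. It is an illustration under stated assumptions, not the paper's algorithms: local training is omitted, the ring topology and its uniform 1/3 weights are chosen only for simplicity, and the function names (fedavg_aggregate, ring_mixing_matrix, dfedavg_aggregate) are hypothetical.

```python
import random


def fedavg_aggregate(client_models, num_sampled, rng):
    """Centralized step: a server averages a random subset of client models.

    Sampling fewer than all clients corresponds to partial participation.
    """
    sampled = rng.sample(range(len(client_models)), num_sampled)
    dim = len(client_models[0])
    return [
        sum(client_models[i][d] for i in sampled) / num_sampled
        for d in range(dim)
    ]


def ring_mixing_matrix(m):
    """Doubly stochastic mixing matrix for a ring (an assumed topology).

    Each client puts weight 1/3 on itself and on each of its two neighbors.
    """
    if m < 3:
        raise ValueError("this ring construction needs at least 3 clients")
    W = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in (i - 1, i, i + 1):
            W[i][j % m] = 1.0 / 3.0
    return W


def dfedavg_aggregate(client_models, W):
    """Decentralized step: client i mixes its neighbors' models with row i of W.

    There is no server, so every client keeps its own (different) model.
    """
    m = len(client_models)
    dim = len(client_models[0])
    return [
        [sum(W[i][j] * client_models[j][d] for j in range(m)) for d in range(dim)]
        for i in range(m)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Six clients, each holding a one-dimensional "model" for illustration.
    models = [[float(i)] for i in range(6)]

    # Centralized: one shared model after aggregation (here, full participation).
    print("CFL global model:", fedavg_aggregate(models, 6, rng))

    # Decentralized: each client ends the round with its own mixed model.
    print("DFL client models:", dfedavg_aggregate(models, ring_mixing_matrix(6)))
```

In this sketch the centralized step yields a single shared model, while one decentralized gossip step leaves the clients with different models whose average is unchanged. How fast such models reach consensus depends on the mixing matrix, which is one way the topology enters a comparison like the one in the abstract.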
