Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization
Ziqing Fan, Shengchao Hu, Jiangchao Yao, Gang Niu, Ya Zhang, Masashi Sugiyama, Yanfeng Wang
TL;DR
This work addresses the misalignment between local and global sharpness in federated learning caused by data heterogeneity, which degrades generalization when using sharpness-aware minimization (SAM). It introduces FedLESAM, a lightweight method that estimates the global perturbation direction on each client from the difference between the global models received in the previous active round and the current round, so each iteration needs only a single backpropagation. The authors provide theoretical results showing a slightly tighter convergence bound than FedSAM and derive an estimation-error bound for the global perturbation direction. Empirically, FedLESAM and its variants achieve superior or competitive performance across four federated benchmarks under multiple data-splitting strategies while reducing computational overhead. In the notation used, $F(w)$ and $F_i(w)$ denote the global and client losses, $\rho$ is the perturbation magnitude, and $w^{\mathrm{old}}_i$ is the global model that client $i$ received in its previous active round. Overall, FedLESAM brings federated SAM closer to the behavior of centralized SAM and offers an efficient, scalable improvement in heterogeneous FL settings.
Abstract
In federated learning (FL), the multi-step update and data heterogeneity among clients often lead to a loss landscape with sharper minima, degrading the performance of the resulting global model. Prevalent federated approaches incorporate sharpness-aware minimization (SAM) into local training to mitigate this problem. However, the local loss landscapes may not accurately reflect the flatness of the global loss landscape in heterogeneous environments; as a result, minimizing local sharpness and calculating perturbations on client data may fail to match the efficacy that SAM achieves in centralized training. To overcome this challenge, we propose FedLESAM, a novel algorithm that locally estimates the direction of the global perturbation on the client side as the difference between the global models received in the previous active round and the current round. Besides improving quality, FedLESAM also speeds up federated SAM-based approaches, since it performs backpropagation only once per iteration. Theoretically, we prove a slightly tighter bound than that of the original FedSAM by ensuring consistent perturbation. Empirically, we conduct comprehensive experiments on four federated benchmark datasets under three partition strategies to demonstrate the superior performance and efficiency of FedLESAM.
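The perturbation estimate described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the list-of-floats model representation, the normalization to length $\rho$, and the sign convention (previous model minus current model) are assumptions made for illustration. The point it shows is that the perturbation comes from two stored global models, so the local step evaluates the gradient only once, at the perturbed point.

```python
import math


def estimated_global_perturbation(w_old, w_cur, rho, eps=1e-12):
    """Estimate a perturbation of length rho along (w_old - w_cur).

    w_old: global model the client received in its previous active round.
    w_cur: global model received in the current round.
    Assumption: the difference is normalized and scaled by rho.
    """
    diff = [a - b for a, b in zip(w_old, w_cur)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm < eps:
        # No usable direction (e.g. first participation): no perturbation.
        return [0.0] * len(diff)
    return [rho * d / norm for d in diff]


def local_step(w, delta, grad_fn, lr):
    """One local update using a single gradient evaluation.

    grad_fn is a hypothetical callable returning the client gradient
    at a given parameter vector.
    """
    w_perturbed = [wi + di for wi, di in zip(w, delta)]
    g = grad_fn(w_perturbed)  # the only gradient computation in this step
    return [wi - lr * gi for wi, gi in zip(w, g)]


# Toy usage with a quadratic client loss 0.5 * ||w - 1||^2.
delta = estimated_global_perturbation([3.0, 4.0], [0.0, 0.0], rho=0.5)
w_next = local_step([0.0, 0.0], delta, lambda w: [wi - 1.0 for wi in w], lr=0.1)
```

By contrast, a standard local SAM step would first compute a gradient to build the perturbation and then a second gradient at the perturbed point, which is the extra cost the abstract says FedLESAM avoids.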
