Extra Clients at No Extra Cost: Overcome Data Heterogeneity in Federated Learning with Filter Decomposition
Wei Chen, Qiang Qiu
TL;DR
This work tackles data heterogeneity in federated learning (FL) by decomposing convolutional filters into filter atoms $\boldsymbol{D}$ and atom coefficients $\boldsymbol{\alpha}$. Global aggregation averages $\boldsymbol{\alpha}$ and $\boldsymbol{D}$ separately and then reconstructs the global model as $\boldsymbol{\theta}_{\boldsymbol{\phi}} = \boldsymbol{\alpha} \times \boldsymbol{D}$; the product of the two averages contains cross-terms that act as additional latent clients, expanding the ensemble of participating clients at no extra cost. The authors provide variance-reduction and convergence analyses and demonstrate consistent accuracy gains across FL baselines on CIFAR-10/100 and Tiny-ImageNet, along with improved personalization and communication efficiency via a fast/slow update scheme. The approach is straightforward to integrate with existing FL methods and supports flexible personalization and communication schedules, with notable improvements on challenging datasets such as Tiny-ImageNet.
Abstract
Data heterogeneity is one of the major challenges in federated learning (FL), resulting in substantial client variance and slow convergence. In this study, we propose a novel solution: decomposing a convolutional filter in FL into a linear combination of filter subspace elements, i.e., filter atoms. This simple technique transforms global filter aggregation in FL into aggregating filter atoms and their atom coefficients. The key advantage is that expanding the product of the two weighted sums, one over filter atoms and one over atom coefficients, mathematically generates numerous cross-terms. These cross-terms effectively emulate many additional latent clients, significantly reducing model variance, as validated by our theoretical analysis and empirical observations. Furthermore, our method permits different training schemes for filter atoms and atom coefficients, enabling highly adaptive model personalization and communication efficiency. Empirical results on benchmark datasets demonstrate that our filter decomposition technique substantially improves the accuracy of FL methods, confirming its efficacy in addressing data heterogeneity.
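
The cross-term argument can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes each client i holds a coefficient matrix alpha_i and an atom matrix D_i, that the server averages them with client weights w_i, and that filters are reconstructed as the matrix product of the two averages. The toy sizes, the weights, and all function and variable names are hypothetical and chosen only for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def weighted_sum(mats, weights):
    """Element-wise weighted sum of equally shaped matrices."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(w * m[r][c] for m, w in zip(mats, weights)) for c in range(cols)]
            for r in range(rows)]

def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
# Toy sizes (assumed): 3 clients, 4 filters, 2 atoms, flattened 3x3 kernels.
num_clients, num_filters, num_atoms, kernel_size = 3, 4, 2, 9
weights = [0.5, 0.3, 0.2]  # hypothetical client weights summing to 1

alphas = [rand_matrix(num_filters, num_atoms) for _ in range(num_clients)]
atoms = [rand_matrix(num_atoms, kernel_size) for _ in range(num_clients)]

# Server side: average coefficients and atoms separately, then reconstruct.
theta_global = matmul(weighted_sum(alphas, weights), weighted_sum(atoms, weights))

# Explicit expansion: every (i, j) pair contributes w_i * w_j * alpha_i @ D_j.
pairs = [(i, j) for i in range(num_clients) for j in range(num_clients)]
theta_expanded = weighted_sum(
    [matmul(alphas[i], atoms[j]) for i, j in pairs],
    [weights[i] * weights[j] for i, j in pairs],
)

max_gap = max(abs(a - b)
              for row_a, row_b in zip(theta_global, theta_expanded)
              for a, b in zip(row_a, row_b))
# Pairs with i != j combine one client's coefficients with another's atoms.
num_cross_terms = sum(1 for i, j in pairs if i != j)
print(f"terms: {len(pairs)}, cross-terms: {num_cross_terms}, max gap: {max_gap:.2e}")
```

With N clients the reconstruction is a weighted sum of N^2 products, of which N^2 - N pair the coefficients of one client with the atoms of another. These mixed pairs are the cross-terms the abstract describes as latent clients, and they come at no additional communication cost because only the N coefficient and atom matrices are exchanged.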
