Balancing Similarity and Complementarity for Federated Learning
Kunda Yan, Sen Cui, Abudukelimu Wuerkaixi, Jingfeng Zhang, Bo Han, Gang Niu, Masashi Sugiyama, Changshui Zhang
TL;DR
The paper addresses non-i.i.d. data challenges in Federated Learning by arguing that optimal cooperation is not achieved by pursuing maximum model similarity alone. It introduces FedSaC, a two-stage framework that learns a cooperation network by jointly optimizing a weighted mix of similarity and feature complementarity, the latter quantified via principal angles between local data subspaces obtained from SVD. The method uses a server-side optimization to derive the adjacency matrix and broadcasts aggregated models, followed by client-side refinement that respects the server-derived cooperation while fitting local data. Empirical results on unimodal (CIFAR-10/100) and multimodal (CUB200-2011) benchmarks show that FedSaC consistently surpasses state-of-the-art FL methods across various heterogeneity regimes, validating the importance of exploiting data complementarity in cooperative learning.
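To make the complementarity measure concrete, the following is a minimal NumPy sketch of principal angles between two clients' data subspaces. It rests on several assumptions that the summary does not confirm: each client's subspace is taken to be the span of the top-k right singular vectors of its uncentered feature matrix, and the angles are reduced to a single score by averaging and dividing by π/2. The function names, the choice of k, and this normalization are illustrative and are not taken from the paper.

```python
import numpy as np

def top_k_subspace(X, k):
    """Orthonormal basis (d x k) for the top-k right singular subspace of X (n x d)."""
    # Rows of Vt are right singular vectors, ordered by descending singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T

def principal_angles(U1, U2):
    """Principal angles in radians between the column spans of U1 and U2."""
    # The singular values of U1^T U2 are the cosines of the principal angles.
    cosines = np.linalg.svd(U1.T @ U2, compute_uv=False)
    # Clip to guard against round-off pushing a cosine slightly above 1.
    return np.arccos(np.clip(cosines, 0.0, 1.0))

def complementarity(X1, X2, k):
    """Hypothetical score in [0, 1]: mean principal angle divided by pi/2."""
    angles = principal_angles(top_k_subspace(X1, k), top_k_subspace(X2, k))
    return float(np.mean(angles) / (np.pi / 2))

# Toy data (assumed for illustration): client A varies only in the first two
# feature coordinates, client B only in the last two.
X_a = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 2.0, 0.0, 0.0],
                [3.0, 1.0, 0.0, 0.0]])
X_b = np.array([[0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 2.0],
                [0.0, 0.0, 1.0, 1.0]])

score_same = complementarity(X_a, X_a, k=2)  # identical subspaces
score_diff = complementarity(X_a, X_b, k=2)  # orthogonal subspaces
```

Under this normalization, a client compared with itself scores near 0, and two clients whose features occupy orthogonal subspaces score near 1. A server could combine such a score with a model-similarity term when weighting cooperation, although the exact objective is defined in the paper rather than here.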
Abstract
In mobile and IoT systems, Federated Learning (FL) is increasingly important for effectively using data while maintaining user privacy. One key challenge in FL is managing statistical heterogeneity, such as non-i.i.d. data, arising from numerous clients and diverse data sources. Addressing it requires strategic cooperation, often among clients with similar characteristics. However, we are interested in a fundamental question: does achieving optimal cooperation necessarily entail cooperating with the most similar clients? Significant model performance improvements are often realized not by partnering with the most similar models, but by leveraging complementary data. Our theoretical and empirical analyses suggest that optimal cooperation is achieved by enhancing complementarity in feature distribution while restricting the disparity in the correlation between features and targets. Accordingly, we introduce a novel framework, FedSaC, which balances similarity and complementarity in FL cooperation. Our framework aims to approximate an optimal cooperation network for each client by optimizing a weighted sum of model similarity and feature complementarity. The strength of FedSaC lies in its adaptability to various levels of data heterogeneity and to multimodal scenarios. Our comprehensive unimodal and multimodal experiments demonstrate that FedSaC markedly surpasses other state-of-the-art FL methods.
