Federated Mutual Learning
Tao Shen, Jie Zhang, Xinkang Jia, Fengda Zhang, Gang Huang, Pan Zhou, Kun Kuang, Fei Wu, Chao Wu
TL;DR
This work identifies data, objective, and model heterogeneity (DOM) as central challenges in federated learning and proposes Federated Mutual Learning (FML) to address them. FML equips each client with a meme (global fork) model and a personalized local model, and uses deep mutual learning to exchange knowledge via KL-based losses during local updates; meme models are aggregated to progressively refine the global model while clients retain personalized components. Empirical results on MNIST and CIFAR datasets show that FML outperforms FedAvg and FedProx under both IID and Non-IID conditions and remains effective under DOM scenarios involving data, model, and task heterogeneity. The approach preserves privacy by keeping personalized models local and demonstrates robustness to heterogeneity while revealing phenomena such as catfish effects and the benefits of dynamic balancing of learning signals.
Abstract
Federated learning (FL) enables collaborative training of deep learning models on decentralized data. However, three types of heterogeneity in the FL setting pose distinct challenges to the canonical federated learning algorithm (FedAvg). First, because the data are Non-IID, the globally shared model may perform worse than local models trained solely on each client's private data. Second, the objectives of the central server and the clients may differ: the central server seeks a generalized model whereas clients pursue personalized models, and clients may run different tasks. Third, clients may need to design customized models for various scenes and tasks. In this work, we present a novel federated learning paradigm, named Federated Mutual Learning (FML), that deals with all three heterogeneities. FML allows clients to train a generalized model collaboratively and a personalized model independently, and to design their own private customized models. Thus, the Non-IIDness of data is no longer a bug but a feature that lets clients be served better individually. Experiments show that FML achieves better performance than alternatives in the typical FL setting, and that clients benefit from FML even with different models and tasks.
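The knowledge exchange between each client's two models can be illustrated with a minimal sketch. It assumes a deep-mutual-learning style objective in which each model combines its own cross-entropy loss with a KL term toward the other model's predictions; the mixing weights `alpha` and `beta`, the function names, and the unweighted averaging of meme weights are illustrative assumptions, not details stated above.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Cross-entropy of a predicted distribution against an integer label."""
    return -math.log(probs[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mutual_losses(local_logits, meme_logits, label, alpha=0.5, beta=0.5):
    """Per-sample losses for the personalized (local) and meme models.

    alpha and beta are hypothetical mixing weights (an assumption):
    each model balances fitting the label against matching its peer.
    """
    p_local = softmax(local_logits)
    p_meme = softmax(meme_logits)
    # The local model learns from the label and from the meme model's predictions.
    loss_local = (alpha * cross_entropy(p_local, label)
                  + (1 - alpha) * kl_divergence(p_meme, p_local))
    # The meme model learns from the label and from the local model's predictions.
    loss_meme = (beta * cross_entropy(p_meme, label)
                 + (1 - beta) * kl_divergence(p_local, p_meme))
    return loss_local, loss_meme

def average_meme_weights(client_weights):
    """Unweighted average of the clients' meme-model weights (an assumption;
    weighting by local dataset size is a common alternative)."""
    n = len(client_weights)
    return [sum(ws) / n for ws in zip(*client_weights)]

if __name__ == "__main__":
    loss_local, loss_meme = mutual_losses([2.0, 0.5, -1.0], [1.0, 1.5, 0.0], label=0)
    print(loss_local, loss_meme)
    print(average_meme_weights([[1.0, 2.0], [3.0, 4.0]]))
```

Only the meme weights pass through `average_meme_weights`; the personalized model never leaves the client, which is why it can have a different architecture or task from the shared model.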
