DFML: Decentralized Federated Mutual Learning
Yasser H. Khalil, Amir H. Estiri, Mahdi Beitollahi, Nader Asadi, Sobhan Hemati, Xu Li, Guojun Zhang, Xi Chen
TL;DR
DFML is a serverless federated learning framework in which heterogeneous clients learn from one another through mutual learning, without relying on public data. It introduces a joint objective $L = (1-\alpha)L_{WSM} + \alpha L_{KL}$, in which a cyclic schedule for $\alpha^{(t)}$ balances the supervision term $L_{WSM}$ against the distillation term $L_{KL}$, and peak models $\widehat{W}_n$ stabilize global knowledge. The approach supports nonrestrictive model heterogeneity and converges faster and reaches higher global accuracy than decentralized baselines in both IID and non-IID settings, across multiple datasets and architectures. It offers a scalable, privacy-preserving alternative for real-world deployments where central servers are impractical or undesirable and where data and model heterogeneity are intrinsic.
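The joint objective can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine shape of the schedule, the `period` parameter, the function names, and the use of plain cross-entropy as a stand-in for $L_{WSM}$ are all assumptions made for illustration, since the summary above only states that $\alpha^{(t)}$ varies cyclically.

```python
import math

def cyclic_alpha(t, period, alpha_min=0.0, alpha_max=1.0):
    """Hypothetical cosine cycle for alpha^(t): starts at alpha_min,
    peaks at alpha_max mid-period, and returns to alpha_min."""
    phase = (t % period) / period
    return alpha_min + 0.5 * (alpha_max - alpha_min) * (1.0 - math.cos(2.0 * math.pi * phase))

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Supervised loss for one sample (assumed stand-in for L_WSM)."""
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def joint_loss(student_logits, teacher_probs, label, alpha):
    """L = (1 - alpha) * L_supervised + alpha * L_KL for one sample."""
    supervised = cross_entropy(student_logits, label)
    distill = kl_divergence(teacher_probs, softmax(student_logits))
    return (1.0 - alpha) * supervised + alpha * distill
```

Under these assumptions, `alpha = 0` reduces the loss to pure supervision and `alpha = 1` to pure distillation, so cycling `alpha` alternates the emphasis between a client's local labels and the knowledge distilled from its peers.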
Abstract
On real-world devices, the centralized server in Federated Learning (FL) presents challenges including communication bottlenecks and susceptibility to a single point of failure. Additionally, contemporary devices inherently exhibit model and data heterogeneity. Existing work lacks a Decentralized FL (DFL) framework capable of accommodating such heterogeneity without imposing architectural restrictions or assuming the availability of public data. To address these issues, we propose a Decentralized Federated Mutual Learning (DFML) framework that is serverless, supports nonrestrictive heterogeneous models, and avoids reliance on public data. DFML handles model and data heterogeneity through mutual learning, which distills knowledge between clients, and by cyclically varying the amount of supervision and distillation signals. Extensive experimental results demonstrate the consistent effectiveness of DFML in both convergence speed and global accuracy, outperforming prevalent baselines under various conditions. For example, with the CIFAR-100 dataset and 50 clients, DFML achieves a substantial increase of +17.20% and +19.95% in global accuracy under Independent and Identically Distributed (IID) and non-IID data shifts, respectively.
