Communication-Efficient Diffusion Strategy for Performance Improvement of Federated Learning with Non-IID Data
Seyoung Ahn, Soohyeong Kim, Yongseok Kwon, Joohan Park, Jiseung Youn, Sunghyun Cho
TL;DR
This work tackles the degradation of global model performance in federated learning caused by non-IID data by introducing FedDif, a diffusion-based mechanism that propagates local models across neighboring users to learn diverse data distributions prior to aggregation. The diffusion process is paired with an auction-theoretic scheduling method to trade learning gains against communication costs, with a formal IID-distance measure based on Wasserstein distance guiding the optimization. Theoretical results link diffusion to reduced weight divergence and potential convergence to IID-like performance, while experiments on CIFAR-100, EMNIST, and CIFAR-10 show significant accuracy gains and notable communication savings relative to state-of-the-art baselines. The approach offers a practical path to robust, privacy-preserving FL in 6G-type wireless networks and is complemented by open-source implementations and future directions for extending diffusion concepts to vertical FL and more complex network settings.
Abstract
In 6G mobile communication systems, various AI-based network functions and applications have been standardized. Federated learning (FL) is adopted as the core learning architecture for 6G systems to avoid privacy leakage from mobile user data. However, in FL, users with non-independent and identically distributed (non-IID) datasets can degrade the performance of the global model because the gradient for each dataset converges in a different direction, which induces a weight-divergence problem. To address this problem, we propose a novel diffusion strategy for machine learning (ML) models (FedDif) that maximizes the performance of the global model with non-IID data. FedDif lets each local model learn different data distributions before parameter aggregation by passing the local models from user to user via device-to-device communication. Furthermore, we theoretically demonstrate that FedDif can circumvent the weight-divergence problem. Based on this theory, we propose a communication-efficient diffusion strategy for ML models that uses auction theory to determine the trade-off between learning performance and communication cost. The experimental results show that FedDif improves the top-1 test accuracy by up to 34.89% and reduces communication costs by 14.6% to 63.49%.
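To make the diffusion idea concrete, the following minimal sketch tracks how far the data seen by one local model is from a uniform label distribution as the model is passed along a chain of users. It is an illustration only, not the authors' implementation: the function names, the uniform reference distribution, the treatment of class indices as points on a line, and the toy label counts are all assumptions made here, and the paper's exact IID-distance definition may differ.

```python
from itertools import accumulate


def label_distribution(counts):
    """Normalize per-class sample counts into a probability vector."""
    total = sum(counts)
    return [c / total for c in counts]


def wasserstein_1d(p, q):
    """1-Wasserstein distance between two distributions on classes 0..K-1.

    Assumption: class indices are treated as points on a line with unit
    spacing, so the distance is the summed absolute difference of the CDFs.
    """
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


def iid_distance_along_chain(user_counts, chain):
    """Distance to the uniform distribution after each hop of a diffusion chain.

    user_counts[u] holds the per-class sample counts of user u (hypothetical
    data); chain lists the users that train the model, in visiting order.
    """
    num_classes = len(user_counts[0])
    uniform = [1.0 / num_classes] * num_classes
    seen = [0] * num_classes
    distances = []
    for user in chain:
        # The model has now been trained on this user's samples as well.
        seen = [s + c for s, c in zip(seen, user_counts[user])]
        distances.append(wasserstein_1d(label_distribution(seen), uniform))
    return distances


# Toy extreme non-IID setting: each of four users holds a single class.
user_counts = [
    [100, 0, 0, 0],
    [0, 100, 0, 0],
    [0, 0, 100, 0],
    [0, 0, 0, 100],
]
distances = iid_distance_along_chain(user_counts, chain=[0, 1, 2, 3])
print(distances)
```

In this toy setting the distance shrinks with every hop and reaches zero once the model has visited all four users. A scheduler in the spirit of the abstract would weigh each such reduction against the communication cost of the corresponding device-to-device transfer.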
