Communication-Efficient Diffusion Strategy for Performance Improvement of Federated Learning with Non-IID Data

Seyoung Ahn; Soohyeong Kim; Yongseok Kwon; Joohan Park; Jiseung Youn; Sunghyun Cho

TL;DR

This work tackles the degradation of global model performance in federated learning caused by non-IID data by introducing FedDif, a diffusion-based mechanism that propagates local models across neighboring users to learn diverse data distributions prior to aggregation. The diffusion process is paired with an auction-theoretic scheduling method to trade learning gains against communication costs, with a formal IID-distance measure based on Wasserstein distance guiding the optimization. Theoretical results link diffusion to reduced weight divergence and potential convergence to IID-like performance, while experiments on CIFAR-100, EMNIST, and CIFAR-10 show significant accuracy gains and notable communication savings relative to state-of-the-art baselines. The approach offers a practical path to robust, privacy-preserving FL in 6G-type wireless networks and is complemented by open-source implementations and future directions for extending diffusion concepts to vertical FL and more complex network settings.

Abstract

In 6G mobile communication systems, various AI-based network functions and applications have been standardized. Federated learning (FL) is adopted as the core learning architecture for 6G systems to avoid privacy leakage from mobile user data. However, in FL, users with non-independent and identically distributed (non-IID) datasets can degrade the performance of the global model because the gradient for each dataset converges in a different direction, which induces a weight divergence problem. To address this problem, we propose a novel diffusion strategy for machine learning (ML) models (FedDif) to maximize the performance of the global model with non-IID data. FedDif enables each local model to learn different data distributions before parameter aggregation by passing the local models from user to user via device-to-device communication. Furthermore, we theoretically demonstrate that FedDif can circumvent the weight divergence problem. Based on this theory, we propose a communication-efficient diffusion strategy for ML models that uses auction theory to determine the trade-off between learning performance and communication cost. The experimental results show that FedDif improves the top-1 test accuracy by up to 34.89% and reduces communication costs by 14.6% to as much as 63.49%.
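The diffusion idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the auction-based scheduling is replaced here by a plain greedy rule, communication cost is ignored, and the IID distance is assumed to be the 1-D Wasserstein-1 distance between the label distribution a model has seen so far and the uniform distribution, with class indices treated as points on a line. All names (`iid_distance`, `greedy_diffusion`, the toy `users` dictionary) are hypothetical.

```python
from itertools import accumulate


def iid_distance(counts):
    """Wasserstein-1 distance between a label histogram and the uniform
    distribution, treating class indices 0..C-1 as points on a line
    (an assumption made for this sketch)."""
    total = sum(counts)
    p = [c / total for c in counts]
    q = [1.0 / len(counts)] * len(counts)
    # For 1-D distributions on unit-spaced points, W1 is the sum of
    # absolute differences between the two cumulative distributions.
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def greedy_diffusion(start, user_counts, max_hops):
    """Pass one model along a chain of users, each time choosing the
    unvisited user whose data brings the cumulative label distribution
    closest to uniform. Stops when no candidate reduces the distance."""
    chain = [start]
    seen = list(user_counts[start])

    def dist_after(user):
        merged = [s + c for s, c in zip(seen, user_counts[user])]
        return iid_distance(merged)

    for _ in range(max_hops):
        candidates = [u for u in user_counts if u not in chain]
        if not candidates:
            break
        best = min(candidates, key=dist_after)
        if dist_after(best) >= iid_distance(seen):
            break  # further diffusion would not make the chain more IID
        seen = [s + c for s, c in zip(seen, user_counts[best])]
        chain.append(best)
    return chain, iid_distance(seen)


# Toy example: four users, three classes, each user holds a single class.
users = {"a": [10, 0, 0], "b": [0, 10, 0], "c": [0, 0, 10], "d": [10, 0, 0]}
chain, distance = greedy_diffusion("a", users, max_hops=3)
print(chain, distance)
```

In this toy setting the model starting at user `a` visits the users holding the two missing classes and then stops, because passing it to `d` would skew the cumulative distribution again. A real scheduler would also weigh the cost of each device-to-device transfer, which is the role the abstract assigns to the auction.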

Paper Structure (25 sections, 5 theorems, 66 equations, 7 figures, 1 table, 2 algorithms)

Introduction
Related works
System model and problem formulation
System description
Representations of Data Distribution
Problem formulation
Communication-efficient diffusion strategy
Modeling the bidding price
Winner selection algorithm
Complexity analysis
Theoretical analysis
Experimental analysis
Experimental setup
Discussion on the heterogeneity of non-IID data
Comparison of FedDif with other SOTA methods
...and 10 more sections

Key Result

Proposition 1

An upper bound on the weight difference can be expressed in terms of the factor $a^{(m)} = 1 + \frac{\eta}{\lvert\mathcal{P}_{K}^{(m)}\rvert} \sum_{i \in \mathcal{P}_{K}^{(m)}}\lambda_{i}$.
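As a quick numeric illustration of the factor $a^{(m)}$ only, consider hypothetical values that do not come from the paper: $\eta = 0.01$, a set $\mathcal{P}_{K}^{(m)}$ with three members, and constants $\lambda_{1} = 1$, $\lambda_{2} = 2$, $\lambda_{3} = 3$.

```latex
a^{(m)} = 1 + \frac{\eta}{\lvert\mathcal{P}_{K}^{(m)}\rvert} \sum_{i \in \mathcal{P}_{K}^{(m)}} \lambda_{i}
        = 1 + \frac{0.01}{3}\,(1 + 2 + 3)
        = 1.02
```

With these assumed values the factor sits only slightly above 1, and it moves closer to 1 as $\eta$ or the average of the $\lambda_{i}$ shrinks.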

Figures (7)

Figure 1: Overview of FedDif.
Figure 2: Test accuracy comparison by changing values of concentration parameter and minimum tolerable IID distance. Subfigures (a), (b), and (c) indicate CIFAR-100, EMNIST, and CIFAR-10 datasets, respectively.
Figure 3: Test accuracy comparison corresponding to the heterogeneity of dataset and ML tasks. Subfigures (a)-(c), (d)-(f), and (g)-(i) indicate CIFAR-100, EMNIST, and CIFAR-10 datasets, respectively.
Figure 4: Comparison of the top-1 test accuracy (%). (300 communication rounds, 100 clients)
Figure 5: Comparison of communication costs. (300 communication rounds, CIFAR-100, 100 clients)
...and 2 more figures

Theorems & Definitions (13)

Proposition 1
proof
Remark 1
Remark 2
Remark 3
Remark 4
Proposition 2
proof
Lemma 1
proof
...and 3 more
