FlocOff: Data Heterogeneity Resilient Federated Learning with Communication-Efficient Edge Offloading

Mulei Ma, Chenyu Gong, Liekang Zeng, Yang Yang, Liantao Wu

TL;DR

This work tackles data heterogeneity in edge federated learning by using computation offloading to reshape local data distributions, guided by KL-divergence-based offloading decisions and data-aware service matching. It introduces FlocOff, a framework built on two subproblems, MKL-CO (Minimizes the KL Divergence via Computation Offloading) and MCC-RA (Minimizes the Communication Cost through Resource Allocation), and provides a theoretical analysis linking data distribution, gradient perturbations, and convergence. The authors prove that decoupling the original non-convex cost into these subproblems preserves tractability and can approach optimality; the subproblems are handled by a KL-divergence-driven offloading algorithm and a quasi-convex power allocation solved with a method that has guaranteed convergence. Empirical results on MNIST and CIFAR-10 show that FlocOff yields accuracy gains of 14.3–32.7% and reduces communication cost by up to 82.4%, supporting its practical value for scalable, privacy-preserving edge learning with heterogeneous data.

Abstract

Federated Learning (FL) has emerged as a fundamental learning paradigm for harnessing massive data scattered across geo-distributed edge devices in a privacy-preserving way. Because edge devices are deployed heterogeneously, however, their data are usually Non-IID, which poses significant challenges to FL, including degraded training accuracy, intensive communication costs, and high computing complexity. Traditional approaches typically rely on adaptive mechanisms, which may suffer from scalability issues, increased computational overhead, and limited adaptability to diverse edge environments. This paper instead leverages the observation that computation offloading involves inherent functionalities, such as node matching and service correlation, that can be used to reshape data, and proposes the Federated learning based on computing Offloading (FlocOff) framework to address data heterogeneity and resource constraints. Specifically, FlocOff formulates the FL process with Non-IID data in edge scenarios and derives a rigorous analysis of the impact of imbalanced data distribution. Based on this, FlocOff decouples the optimization into two steps: (1) Minimizes the Kullback-Leibler (KL) divergence via Computation Offloading scheduling (MKL-CO); (2) Minimizes the Communication Cost through Resource Allocation (MCC-RA). Extensive experimental results demonstrate that FlocOff improves model convergence and accuracy by 14.3%-32.7% while reducing data heterogeneity under various data distributions.
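
To make the KL-divergence idea concrete, the following is a minimal sketch and not the authors' MKL-CO algorithm. It assumes that heterogeneity is measured on each device's label distribution against a uniform (IID-like) reference, and that offloading simply pools two devices' samples; the helper names `label_distribution` and `kl_divergence` and the toy data are hypothetical.

```python
import math
from collections import Counter


def label_distribution(labels, num_classes):
    """Empirical class frequencies for a list of integer labels."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in range(num_classes)]


def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i).

    Terms with p_i == 0 contribute 0. Assumes q_i > 0 wherever p_i > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


num_classes = 4
# Assumption: the IID reference is the uniform distribution over classes.
uniform = [1.0 / num_classes] * num_classes

# Toy Non-IID devices (hypothetical data): each sees only two classes.
device_a = [0] * 90 + [1] * 10
device_b = [2] * 50 + [3] * 50

kl_a = kl_divergence(label_distribution(device_a, num_classes), uniform)

# Assumption: offloading device_b's samples to device_a pools the two sets.
pooled = device_a + device_b
kl_pooled = kl_divergence(label_distribution(pooled, num_classes), uniform)

print(f"KL before pooling: {kl_a:.4f}")
print(f"KL after pooling:  {kl_pooled:.4f}")
```

In this toy setting the pooled distribution is closer to uniform than device A's own, which is the effect an offloading scheduler driven by KL divergence would look for when matching nodes.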

Paper Structure

This paper contains 24 sections, 1 theorem, 48 equations, 11 figures, 2 tables, 2 algorithms.

Key Result

Theorem 1

Let $w$ and $v$ be the model parameters trained on Non-IID and IID data, respectively. Assume that in weight aggregation period $t$ ($t > 1$), the update frequencies of $w$ and $v$ are synchronized. The following formula is established:

Figures (11)

  • Figure 1: Illustration of the proposed Federated Learning framework based on edge computation offloading.
  • Figure 2: Impairment of model training by data heterogeneity.
  • Figure 3: Classification of approaches for improving federated learning efficiency and addressing data heterogeneity. Our solution employs computation offloading to address data heterogeneity.
  • Figure 4: Process of FlocOff algorithm based on computation offloading scheduling.
  • Figure 5: Impact of server threshold.
  • ...and 6 more figures

Theorems & Definitions (3)

  • Definition 1
  • Theorem 1
  • Definition 2