Communication-Efficient Federated Learning over Wireless Channels via Gradient Sketching

Vineet Sunil Gattani, Junshan Zhang, Gautam Dasarathy

TL;DR

Federated Proximal Sketching (FPS) is tailored to band-limited wireless channels and to data heterogeneity across edge devices, making it a promising approach to federated learning (FL) over wireless multiple access channels (MACs).

Abstract

Large-scale federated learning (FL) over wireless multiple access channels (MACs) has emerged as a crucial learning paradigm with a wide range of applications. However, its widespread adoption is hindered by several major challenges, including limited bandwidth shared by many edge devices, noisy and erroneous wireless communications, and heterogeneous datasets with different distributions across edge devices. To overcome these fundamental challenges, we propose Federated Proximal Sketching (FPS), which is tailored to band-limited wireless channels and handles data heterogeneity across edge devices. FPS uses a count sketch data structure to address the bandwidth bottleneck and enable efficient compression while maintaining accurate estimation of significant coordinates. Additionally, we modify the loss function in FPS so that it can deal with varying degrees of data heterogeneity. We establish the convergence of the FPS algorithm under mild technical conditions and characterize how the bias induced by factors such as data heterogeneity and noisy wireless channels affects the overall result. We complement the proposed theoretical framework with numerical experiments that demonstrate the stability, accuracy, and efficiency of FPS in comparison to state-of-the-art methods on both synthetic and real-world datasets. Overall, our results show that FPS is a promising solution to the above challenges of FL over wireless MACs.
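To make the count sketch idea in the abstract concrete, here is a minimal Python sketch of the generic data structure. It is not the authors' implementation: the class name `CountSketch`, the parameters `rows`, `cols`, and `seed`, and the toy gradients are all hypothetical, and the proximal loss term and channel noise of FPS are not modeled. It illustrates two standard properties: large coordinates of a vector can be estimated from a much smaller table, and sketching is linear, so the sum of the devices' sketches equals the sketch of the summed gradient.

```python
import numpy as np


class CountSketch:
    """Generic count sketch (illustrative only, not the FPS implementation)."""

    def __init__(self, dim, rows=5, cols=512, seed=0):
        # All devices are assumed to share the same seed, hence the same hashes.
        rng = np.random.default_rng(seed)
        self.rows, self.cols = rows, cols
        # One bucket index and one random sign per (row, coordinate).
        self.buckets = rng.integers(0, cols, size=(rows, dim))
        self.signs = rng.choice(np.array([-1.0, 1.0]), size=(rows, dim))

    def encode(self, vec):
        """Compress a length-dim vector into a rows x cols table."""
        table = np.zeros((self.rows, self.cols))
        for r in range(self.rows):
            # np.add.at accumulates correctly when bucket indices repeat.
            np.add.at(table[r], self.buckets[r], self.signs[r] * vec)
        return table

    def decode(self, table):
        """Estimate every coordinate as the median of its per-row estimates."""
        row_idx = np.arange(self.rows)[:, None]
        estimates = self.signs * table[row_idx, self.buckets]
        return np.median(estimates, axis=0)


# Toy example: two devices whose gradients have a few large coordinates.
dim = 1000
data_rng = np.random.default_rng(1)
grad_a = 0.01 * data_rng.standard_normal(dim)
grad_b = 0.01 * data_rng.standard_normal(dim)
grad_a[3], grad_a[500] = 10.0, -8.0
grad_b[3], grad_b[999] = 5.0, 6.0

sketch = CountSketch(dim)
# Summing the tables stands in for noiseless superposition over the channel.
aggregated = sketch.encode(grad_a) + sketch.encode(grad_b)
recovered = sketch.decode(aggregated)
```

Here each device sends a 5 x 512 table (2,560 values) regardless of `dim`, so the savings grow with the model dimension. The median across rows is what keeps the estimate of a large coordinate stable when it collides with another large coordinate in a single row.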

Paper Structure

This paper contains 25 sections, 3 theorems, 35 equations, 10 figures, 3 tables, 2 algorithms.

Key Result

Theorem 1

Under Assumptions as:smoothness, as:bounded-noise, as:bounded-bias, as:grad_comp and as:dissimilarity, the following result holds with probability at least $1-\delta$:

Figures (10)

  • Figure 1: Illustration of Federated Proximal Sketching (FPS) over wireless multi-access channels (MAC).
  • Figure 2: Test accuracy of FPS, BLCD, and FetchSGD on the KDD12 dataset under noisy channel conditions. The panels correspond to different data partitioning strategies: (a) Scenario 1 (b) Scenario 2 (c) Scenario 3 (d) Scenario 4.
  • Figure 3: Test accuracy of FPS, BLCD, and FetchSGD on the MNIST dataset under noisy channel conditions. The panels correspond to different data partitioning strategies: (a) Scenario 1 (b) Scenario 2 (c) Scenario 3 (d) Scenario 4.
  • Figure 4: Log of test loss for FPS, BLCD, and FetchSGD over 5 trials under noisy channel conditions, with the gradients following Assumption as:grad_comp and power law degree $p = 5$. The panels correspond to different data partitioning strategies: (a) Scenario 1 (b) Scenario 2 (c) Scenario 3 (d) Scenario 4.
  • Figure 5: Test accuracy of FPS, BLCD, and FetchSGD on the KDD10 dataset under noisy channel conditions. The panels correspond to different data partitioning strategies: (a) Scenario 1 (b) Scenario 2 (c) Scenario 3 (d) Scenario 4. FPS is stable under noisy channel conditions and consistently performs better than the competing band-limited algorithms.
  • ...and 5 more figures

Theorems & Definitions (5)

  • Definition 1
  • Definition 2
  • Theorem 1
  • Lemma 1
  • Lemma 2