Scalable Hierarchical Over-the-Air Federated Learning

Seyed Mohammad Azimi-Abarghouyi, Viktoria Fodor

TL;DR

This work tackles scalability, interference, and data heterogeneity in hierarchical federated learning over wireless networks by introducing MultiAirFed, a two-level learning method that combines intra-cluster gradient aggregation with inter-cluster model-parameter aggregation. A scalable clustered over-the-air uplink and a bandwidth-limited analog downlink transmission scheme are proposed to operate on a single wireless resource block, with optimized receiver normalizing factors to mitigate distortion. The authors model device locations with a Poisson cluster process, derive a tractable convergence bound for multi-cluster networks, and quantify uplink/downlink interference effects. Experimental results on MNIST and CIFAR-10 demonstrate that MultiAirFed outperforms conventional HierFed under realistic interference and non-i.i.d. data, validating the approach’s potential for large-scale edge learning deployments.
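The two-level structure described above (gradient aggregation inside each cluster, model-parameter aggregation across clusters) can be illustrated with a minimal, noise-free sketch. This is not the authors' implementation: it ignores the over-the-air channel, interference, and the receiver normalizing factors, and it assumes a toy quadratic loss per device. The names `two_level_round`, `local_gradient`, `mean_vectors`, and `run_demo` are hypothetical.

```python
import random


def local_gradient(w, target):
    # Gradient of the toy loss 0.5 * ||w - target||^2 (an assumed stand-in
    # for a device's local loss).
    return [wi - ti for wi, ti in zip(w, target)]


def mean_vectors(vectors):
    # Coordinate-wise average of a list of equal-length vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_level_round(global_model, clusters, tau, lr):
    # One global iteration: tau intra-cluster gradient-aggregation steps per
    # cluster, followed by one inter-cluster model-parameter average.
    cluster_models = []
    for device_targets in clusters:
        w = list(global_model)
        for _ in range(tau):
            grads = [local_gradient(w, t) for t in device_targets]
            g = mean_vectors(grads)  # intra-cluster gradient aggregation
            w = [wi - lr * gi for wi, gi in zip(w, g)]
        cluster_models.append(w)
    return mean_vectors(cluster_models)  # inter-cluster model aggregation


def run_demo(rounds=50, tau=3, lr=0.3, seed=0):
    rng = random.Random(seed)
    # Three clusters of four devices each; every cluster has a different
    # data centre, which mimics non-i.i.d. data across clusters.
    clusters = [
        [[rng.gauss(c, 0.5), rng.gauss(-c, 0.5)] for _ in range(4)]
        for c in (1.0, 2.0, 3.0)
    ]
    model = [0.0, 0.0]
    for _ in range(rounds):
        model = two_level_round(model, clusters, tau, lr)
    return model, clusters
```

With equal-sized clusters and this toy loss, the global model converges to the average of all device targets, which is the minimizer of the summed loss.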

Abstract

When implementing hierarchical federated learning over wireless networks, scalability assurance and the ability to handle both interference and device data heterogeneity are crucial. This work introduces a new two-level learning method designed to address these challenges, along with a scalable over-the-air aggregation scheme for the uplink and a bandwidth-limited broadcast scheme for the downlink that efficiently use a single wireless resource. To provide resistance against data heterogeneity, we employ gradient aggregations. Meanwhile, the impact of uplink and downlink interference is minimized through optimized receiver normalizing factors. We present a comprehensive mathematical approach to derive the convergence bound for the proposed algorithm, applicable to a multi-cluster wireless network encompassing any count of collaborating clusters, and provide special cases and design remarks. As a key step to enable a tractable analysis, we develop a spatial model for the setup by modeling devices as a Poisson cluster process over the edge servers and rigorously quantify uplink and downlink error terms due to the interference. Finally, we show that despite the interference and data heterogeneity, the proposed algorithm not only achieves high learning accuracy for a variety of parameters but also significantly outperforms the conventional hierarchical learning algorithm.
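As a rough illustration of the spatial model mentioned above, the sketch below samples a Poisson cluster layout: edge servers form a homogeneous Poisson point process on a square, and devices are scattered around each server. The abstract does not specify the device distribution around a server, so the fixed number of devices per cluster and the uniform-in-disk placement are assumptions made here, as are the function names `sample_poisson` and `sample_cluster_process`.

```python
import math
import random


def sample_poisson(mean, rng):
    # Knuth's multiplication method; adequate for small means.
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_cluster_process(server_density, side, devices_per_cluster, radius, rng):
    # Parent points (edge servers): homogeneous Poisson process on a
    # side x side square with intensity server_density.
    n_servers = sample_poisson(server_density * side * side, rng)
    layout = []
    for _ in range(n_servers):
        sx, sy = rng.uniform(0, side), rng.uniform(0, side)
        devices = []
        for _ in range(devices_per_cluster):
            # Assumption: devices are uniform in a disk around their server.
            # The square root makes the density uniform over the disk area.
            r = radius * math.sqrt(rng.random())
            theta = rng.uniform(0, 2 * math.pi)
            devices.append((sx + r * math.cos(theta), sy + r * math.sin(theta)))
        layout.append(((sx, sy), devices))
    return layout
```

Each entry of the returned list pairs a server position with the positions of its devices, so distances from interfering devices in other clusters to a given server can be computed directly from the layout.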


Paper Structure

This paper contains 15 sections, 3 theorems, 81 equations, 11 figures, 1 table, and 1 algorithm.

Key Result

Theorem 1

Consider a fixed learning rate $\mu_t = \mu$ satisfying [...]. Then the following optimality gap holds for the local learning model of any reference device.

Figures (11)

  • Figure 1: Representation of the FL system with three clusters, and the message exchange in the MultiAirFed method.
  • Figure 2: Test accuracy as a function of global iterations $t$ (i.i.d.)
  • Figure 3: Latency in seconds as a function of intra-cluster iterations $\tau$ (i.i.d.)
  • Figure 4: Test accuracy as a function of global iterations $t$ (i.i.d.)
  • Figure 5: Test accuracy as a function of global iterations $t$ (non-i.i.d.)
  • ...and 6 more figures

Theorems & Definitions (15)

  • Theorem 1
  • Remark 1
  • Remark 2
  • Remark 3
  • Remark 4
  • Remark 5
  • Remark 6
  • Remark 7
  • Remark 8
  • Remark 9
  • ...and 5 more