Mixed-Precision Federated Learning via Multi-Precision Over-The-Air Aggregation

Jinsheng Yuan, Zhuangkun Wei, Weisi Guo

TL;DR

This work tackles the challenge of heterogeneous client bit-precision in Over-The-Air Federated Learning (OTA-FL) by introducing a mixed-precision OTA-FL framework with multi-precision over-the-air aggregation. It presents a channel-aware transmission design, including channel estimation at clients, uplink precoding, and downlink re-quantization, along with a quantization strategy suitable for ultra-low precision. The authors provide an energy-consumption model and validate the approach through a real-world case study on GTSRB using ResNet-50, showing substantial energy savings (e.g., up to ~65% for low-precision clients) and improved or preserved server and client accuracy (around 97% top-1 after 100 rounds). The results demonstrate the practicality of green, heterogeneous edge computing environments and the potential of AxC-based OTA-FL to accommodate diverse hardware while maintaining learning performance.

Abstract

Over-the-Air Federated Learning (OTA-FL) is a privacy-preserving distributed learning mechanism that aggregates updates in the electromagnetic channel rather than at the server. A critical gap in existing OTA-FL research is the assumption of homogeneous client computational bit precision. In real-world applications, however, clients with varying hardware resources may exploit approximate computing (AxC) to operate at different bit precisions optimized for energy and computational efficiency. Model updates of differing precisions across clients pose an open challenge for OTA-FL, as they are incompatible with superposition in wireless modulation. Here, we propose a mixed-precision OTA-FL framework for clients with multiple bit precisions, demonstrating the following innovations: (i) a superior trade-off for both server and clients, within the constraints of varying edge computing capabilities, energy efficiency, and learning accuracy requirements, compared to homogeneous client bit precision, and (ii) a multi-precision gradient modulation scheme that ensures compatibility with OTA aggregation and eliminates the overhead of precision conversion. Through a case study with real-world data, we validate our modulation scheme, which enables AxC-based mixed-precision OTA-FL. Compared to homogeneous standard precisions of 32-bit and 16-bit, our framework achieves a gain of more than 10% in 4-bit ultra-low-precision client performance and energy savings of over 65% and 13%, respectively. This demonstrates the great potential of our mixed-precision OTA-FL approach in heterogeneous edge computing environments.
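
To make the aggregation idea concrete, the following is a minimal illustrative sketch and not the authors' modulation scheme. It assumes a uniform symmetric quantizer, real-valued channel gains that each client knows perfectly, and simple channel-inversion precoding. The names `quantize` and `ota_aggregate` are hypothetical. Each client quantizes its update at its own bit width, and because every quantized value is sent as an amplitude on a shared real-valued scale, the channel can sum updates of different precisions directly.

```python
import random


def quantize(values, bits, max_abs=1.0):
    """Uniform symmetric quantization (an assumed quantizer, for illustration).

    Values are clipped to [-max_abs, max_abs] and snapped to a grid with
    2**(bits - 1) - 1 positive levels, then returned as floats on that grid.
    """
    levels = 2 ** (bits - 1) - 1
    step = max_abs / levels
    quantized = []
    for v in values:
        clipped = max(-max_abs, min(max_abs, v))
        quantized.append(round(clipped / step) * step)
    return quantized


def ota_aggregate(client_updates, client_bits, channel_gains,
                  noise_std=0.0, seed=0):
    """Simulate over-the-air averaging of mixed-precision client updates.

    Assumptions (not from the paper): real-valued gains, perfect channel
    knowledge at each client, and channel-inversion precoding.
    """
    rng = random.Random(seed)
    num_clients = len(client_updates)
    dim = len(client_updates[0])
    received = [0.0] * dim
    for update, bits, gain in zip(client_updates, client_bits, channel_gains):
        q = quantize(update, bits)
        # Precoding: divide by the gain so the channel's multiplication undoes it.
        transmitted = [x / gain for x in q]
        # Superposition: the channel adds all clients' signals together.
        for i in range(dim):
            received[i] += gain * transmitted[i]
    # The server sees the sum plus receiver noise and divides by the client count.
    return [(r + rng.gauss(0.0, noise_std)) / num_clients for r in received]


if __name__ == "__main__":
    updates = [[0.5, -0.25], [0.1, 0.9], [-0.7, 0.3]]
    print(ota_aggregate(updates, client_bits=[16, 8, 4],
                        channel_gains=[0.8, 1.3, 0.5]))
```

With `noise_std=0.0`, the result equals the plain average of the clients' quantized updates, which is the property an OTA scheme relies on. Setting a non-zero `noise_std` shows how receiver noise perturbs the aggregate.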

Paper Structure (23 sections, 9 equations, 4 figures, 2 tables, 2 algorithms)

This paper contains 23 sections, 9 equations, 4 figures, 2 tables, 2 algorithms.

Figures (4)

  • Figure 1: End-to-end Federated Learning (FL) system moves from (a) FL, to (b) privacy preserving OTA-FL, to (c) energy efficient AxC OTA-FL. The research challenge from (b) to (c) is to achieve heterogeneous OTA weight aggregation to cater for mixed bit precision IoT edge computation.
  • Figure 2: Structure of our proposed Approximate Computing (AxC) based OTA-FL framework of multi-precision clients and unified multi-precision modulation scheme. The intelligent transport validation case study here is a multi-precision federated supervised traffic sign recognizer. (a) Clients operate end-to-end at their designated computation precisions with their own data and labels. (b) Multi-precision OTA aggregation process (uplink). (c) Downlink, re-quantization and client model update.
  • Figure 3: Training accuracy over 100 communication rounds with ImageNet pre-trained weight initialization. Schemes $[4, 4, 4]$ (denoted by brown 'X') and $[12, 4, 4]$ (denoted by red upright triangle) converge significantly more slowly than the other schemes, even though the latter has a better random start.
  • Figure 4: Trade-offs between the accuracy of the model quantized to 4-bit and the energy savings in comparison to homogeneous 32-bit and 16-bit clients. Schemes near the bottom-right corner present a superior trade-off towards accuracy.