Biased Over-the-Air Federated Learning under Wireless Heterogeneity

Muhammad Faraz Ul Abrar; Nicolò Michelusi

TL;DR

This work studies over-the-air federated learning (OTA-FL) under wireless heterogeneity by deriving a convergence bound that explicitly separates model bias from update variance. It shows that forcing zero bias can be suboptimal when devices have different average path losses, and it identifies two pre-scaler designs of interest: minimum noise variance, and minimum noise variance zero-bias. The analysis reveals how the participation weights $p_m$ shape convergence through the bias term, while the pre-scaler choices control the transmission and noise variances. Numerical results on MNIST show that pre-scalers that minimize update variance while allowing a small bias converge faster and reach higher final accuracy than existing schemes in heterogeneous wireless environments.

Abstract

Recently, Over-the-Air (OTA) computation has emerged as a promising federated learning (FL) paradigm that leverages the waveform superposition properties of the wireless channel to realize fast model updates. Prior work focused on the OTA device ``pre-scaler'' design under \emph{homogeneous} wireless conditions, in which devices experience the same average path loss, resulting in zero-bias solutions. Yet, zero-bias designs are limited by the device with the worst average path loss and hence may perform poorly in \emph{heterogeneous} wireless settings. In this scenario, there may be a benefit in designing \emph{biased} solutions, in exchange for a lower variance in the model updates. To optimize this trade-off, we study the design of OTA device pre-scalers by focusing on the OTA-FL convergence. We derive an upper bound on the model ``optimality error'', which explicitly captures the effect of bias and variance in terms of the choice of the pre-scalers. Based on this bound, we identify two solutions of interest: minimum noise variance, and minimum noise variance zero-bias solutions. Numerical evaluations show that using OTA device pre-scalers that minimize the variance of FL updates, while allowing a small bias, can provide high gains over existing schemes.
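The trade-off described in the abstract can be illustrated with a minimal Python sketch. It is a toy model and not the paper's system model or pre-scaler optimization: it assumes each device's effective received amplitude is its pre-scaler times the square root of its average path gain, that every pre-scaler is capped by a hypothetical `gamma_max`, and that the server normalizes the superposed signal by the sum of amplitudes. All function names and numeric values are hypothetical.

```python
import math

# Toy model (an assumption, not the paper's formulation):
#   amplitude a_m = gamma_m * sqrt(path_gain_m), with gamma_m <= gamma_max
#   server estimate = sum_m (a_m / alpha) * g_m + z / alpha, alpha = sum_m a_m
#   so the weights are p_m = a_m / alpha and the noise variance is sigma2 / alpha**2.

def effective_weights(prescalers, path_gains):
    """Return the participation weights p_m and the normalizer alpha."""
    amps = [g * math.sqrt(l) for g, l in zip(prescalers, path_gains)]
    alpha = sum(amps)
    return [a / alpha for a in amps], alpha

def zero_bias_prescalers(path_gains, gamma_max):
    """Equalize received amplitudes; the worst path gain sets the common level."""
    worst = min(math.sqrt(l) for l in path_gains)
    return [gamma_max * worst / math.sqrt(l) for l in path_gains]

def max_power_prescalers(path_gains, gamma_max):
    """Every device uses its cap; in this toy model that minimizes noise variance."""
    return [gamma_max] * len(path_gains)

def noise_variance(alpha, sigma2):
    """Variance of the receiver noise after normalizing by alpha."""
    return sigma2 / alpha ** 2

if __name__ == "__main__":
    gains = [1.0, 0.25, 0.04]  # hypothetical heterogeneous average path gains
    for name, design in [("zero-bias", zero_bias_prescalers),
                         ("max-power", max_power_prescalers)]:
        p, alpha = effective_weights(design(gains, 1.0), gains)
        print(name, [round(x, 3) for x in p], noise_variance(alpha, 1.0))
```

In this toy setting the zero-bias design yields uniform weights but a small normalizer, because the device with the worst path gain dictates the common amplitude. The max-power design yields non-uniform weights (a bias toward devices with better channels) in exchange for a lower post-normalization noise variance.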

Paper Structure (9 sections, 1 theorem, 24 equations, 2 figures)

Introduction
System Model and over-the-air FL
Over-the-air transmission over a fading MAC
Biased Over-the-Air-FL
Convergence Analysis and pre-scaler design
Main Convergence Results
OTA pre-scalers design
Numerical Results
Conclusion

Key Result

Theorem 1

With local objective functions $f_m(\mathbf{w})$ satisfying Assumptions 1-3, and fixed learning stepsize $\eta \in [0, \frac{2}{\tilde{\mu} + \tilde{L}}]$, the optimality error given $E_0$ after $t$ FL rounds satisfies

Figures (2)

Figure 1: Illustration of OTA-FL system model
Figure 2: Comparison of various OTA-FL schemes

Theorems & Definitions (1)

Theorem 1
