Biased Federated Learning under Wireless Heterogeneity

Muhammad Faraz Ul Abrar, Nicolò Michelusi

TL;DR

This paper tackles federated learning over wireless networks with heterogeneous channel conditions by introducing a structured, time-invariant bias in OTA-FL and digital FL updates. It derives a unified convergence bound that explicitly links bias and variance to the design parameters, and then uses a successive convex approximation framework to optimize these parameters so that bias and variance are balanced. The approach outperforms state-of-the-art schemes in MNIST experiments under non-i.i.d. data distributions. The results highlight the practical value of allowing controlled bias to accelerate FL convergence in realistic wireless environments.

Abstract

Federated learning (FL) has emerged as a promising framework for distributed learning, enabling collaborative model training without sharing private data. Existing wireless FL works primarily adopt two communication strategies: (1) over-the-air (OTA) computation, which exploits wireless signal superposition for simultaneous gradient aggregation, and (2) digital communication, which allocates orthogonal resources for gradient uploads. Prior works on both schemes typically assume homogeneous wireless conditions (equal path loss across devices) to enforce zero-bias updates or permit uncontrolled bias, resulting in suboptimal performance and high-variance model updates in heterogeneous environments, where devices with poor channel conditions slow down convergence. This paper addresses FL over heterogeneous wireless networks by proposing novel OTA and digital FL updates that allow a structured, time-invariant model bias, thereby reducing variance in FL updates. We analyze their convergence under a unified framework and derive an upper bound on the model "optimality error", which explicitly quantifies the effect of bias and variance in terms of design parameters. Next, to optimize this trade-off, we study a non-convex optimization problem and develop a successive convex approximation (SCA)-based framework to jointly optimize the design parameters. We perform extensive numerical evaluations with several related design variants and state-of-the-art OTA and digital FL schemes. Our results confirm that minimizing the bias-variance trade-off while allowing a structured bias provides better FL convergence performance than existing schemes.
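
The bias-variance trade-off described in the abstract can be illustrated with a toy scalar model. This is only a sketch under simplifying assumptions and is not the paper's OTA or digital scheme: each device has a hypothetical amplitude cap `caps[i]` (standing in for its power budget and path loss), the server receives the superposed signal plus Gaussian noise of standard deviation `noise_std`, and it rescales by the sum of amplitudes. The truncation threshold `tau`, the function names, and all numeric values are made up for illustration.

```python
def estimate_stats(amplitudes, grads, noise_std):
    """Bias and noise variance of the estimate (sum_i a_i * g_i + z) / sum_i a_i."""
    total = sum(amplitudes)
    weights = [a / total for a in amplitudes]      # effective aggregation weights
    target = sum(grads) / len(grads)               # ideal unweighted average
    bias = sum(w * g for w, g in zip(weights, grads)) - target
    noise_var = (noise_std / total) ** 2           # receiver noise after rescaling
    return bias, noise_var


def mse_for_threshold(tau, caps, grads, noise_std):
    """Mean squared error when every device transmits at amplitude min(cap, tau)."""
    amplitudes = [min(c, tau) for c in caps]
    bias, noise_var = estimate_stats(amplitudes, grads, noise_std)
    return bias ** 2 + noise_var


# Hypothetical heterogeneous setup: the last device has a very poor channel.
caps = [1.0, 0.8, 0.5, 0.05]
grads = [1.0, 2.0, 3.0, 4.0]
noise_std = 0.2

# tau = min(caps) forces equal weights (zero bias, large noise);
# tau = max(caps) lets every device use its full cap (small noise, large bias).
thresholds = [0.05, 0.1, 0.2, 0.3, 0.5, 0.8, 1.0]
mse = {tau: mse_for_threshold(tau, caps, grads, noise_std) for tau in thresholds}
best_tau = min(mse, key=mse.get)

for tau in thresholds:
    print(f"tau={tau:<4} mse={mse[tau]:.4f}")
print("best tau:", best_tau)
```

In this toy setup, an intermediate threshold gives a lower error than either the zero-bias extreme or the full-amplitude extreme, which is the intuition behind treating bias as a design parameter rather than forcing it to zero or leaving it uncontrolled.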

Paper Structure

This paper contains 19 sections, 4 theorems, 45 equations, 3 figures.

Key Result

Lemma 1

Under the assumptions of bounded loss gradients and bounded stochastic gradients, the gradient estimation variance satisfies $\mathrm{var}(\hat{\boldsymbol{g}}_t|\mathbf w_t)\leq\zeta^A$, with

Figures (3)

  • Figure 1: A wireless FL setup with one parameter server collaborating with $N$ devices with heterogeneous wireless conditions
  • Figure 2: (a) Sub-optimality gap vs. training time for OTA-FL variants, $G_\text{max} = 20$. Sub-optimality gap (b), normalized accuracy (c), vs. training time showing SOTA OTA-FL comparison, $G_\text{max} = 500\kappa$. Common parameters: $N = 10$, $\kappa = 0.01$, $\mu = 0.01$.
  • Figure 3: (a) Sub-optimality gap vs. training time for digital FL variants. Sub-optimality gap (b), normalized accuracy (c), vs. training time showing SOTA digital FL comparison. Common parameters: $N = 10$, $G_\text{max} = 50\kappa$, $\kappa = 0.01$, $\mu = 0.01$.

Theorems & Definitions (9)

  • Remark 1
  • Lemma 1
  • Lemma 2
  • Remark 2
  • Remark 3
  • Theorem 1
  • Lemma 3
  • Proof
  • Proof of the OTA and digital variance lemmas