Improving Convergence for Semi-Federated Learning: An Energy-Efficient Approach by Manipulating Over-the-Air Distortion

Jingheng Zheng, Hui Tian, Wanli Ni, Yang Tian, Ping Zhang

TL;DR

Simulation results show that, under different network and data distribution conditions, strategically manipulating over-the-air distortion can efficiently adjust the learning rate to improve SemiFL's convergence, and that the proposed algorithms reduce energy consumption.

Abstract

In this paper, we propose a hybrid learning framework that combines federated and split learning, termed semi-federated learning (SemiFL), in which over-the-air computation is utilized for gradient aggregation. A key idea is to strategically adjust the learning rate by manipulating over-the-air distortion to improve SemiFL's convergence. Specifically, we intentionally amplify amplitude distortion to increase the learning rate in the non-stable region, thereby accelerating convergence and reducing communication energy consumption. In the stable region, we suppress noise perturbation to maintain a small learning rate and improve SemiFL's final convergence. Theoretical results demonstrate the antagonistic effects of over-the-air distortion in different regions, under both independent and identically distributed (IID) and non-IID data settings. Then, we formulate two energy consumption minimization problems, one for each region, which together implement a two-region mean square error threshold configuration scheme. Accordingly, we propose two resource allocation algorithms with closed-form solutions. Simulation results show that under different network and data distribution conditions, strategically manipulating over-the-air distortion can efficiently adjust the learning rate to improve SemiFL's convergence. Moreover, energy consumption can be reduced by using the proposed algorithms.
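
The two-region idea in the abstract can be illustrated with a toy sketch. It is not the paper's system model or its resource allocation algorithms. It assumes a one-dimensional quadratic loss per device, models over-the-air aggregation as an amplitude-scaled average plus Gaussian noise, and switches regions on a gradient-magnitude threshold. The function names, the threshold, and all numeric values are hypothetical choices made for illustration only.

```python
import random


def ota_aggregate(local_grads, amplitude, noise_std, rng):
    """Toy over-the-air aggregation: amplitude-scaled average plus Gaussian noise."""
    avg = sum(local_grads) / len(local_grads)
    return amplitude * avg + rng.gauss(0.0, noise_std)


def two_region_schedule(grad_mag, threshold=0.5):
    """Hypothetical schedule returning (amplitude, noise_std) for the current region."""
    if grad_mag > threshold:
        # Assumed non-stable region: amplify amplitude and tolerate more noise.
        return 3.0, 0.2
    # Assumed stable region: no amplification, noise suppressed.
    return 1.0, 0.01


def baseline_schedule(grad_mag):
    """Baseline without amplification: same settings in every round."""
    return 1.0, 0.01


def train(schedule, rounds=60, eta=0.1, seed=0):
    """Run gradient descent on 0.5 * (w - c)^2 averaged over three toy devices."""
    rng = random.Random(seed)
    optima = [0.8, 1.0, 1.2]  # hypothetical per-device optima; global optimum is 1.0
    w = 10.0
    errors = []
    for _ in range(rounds):
        grads = [w - c for c in optima]
        grad_mag = abs(sum(grads) / len(grads))
        errors.append(abs(w - 1.0))
        amplitude, noise_std = schedule(grad_mag)
        # The effective step size is eta * amplitude, so amplification
        # acts like a larger learning rate.
        w -= eta * ota_aggregate(grads, amplitude, noise_std, rng)
    return errors


def first_round_below(errors, tol):
    """Index of the first round whose error is below tol, or None."""
    for t, err in enumerate(errors):
        if err < tol:
            return t
    return None


if __name__ == "__main__":
    fast = first_round_below(train(two_region_schedule), 0.5)
    slow = first_round_below(train(baseline_schedule), 0.5)
    print("rounds to reach error < 0.5:", fast, "(two-region) vs", slow, "(baseline)")
```

In this toy setting the amplified schedule multiplies the step size by the amplitude factor while the model is far from the optimum, then falls back to a small, low-noise step near it, which mirrors the qualitative behaviour described above.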

Paper Structure

This paper contains 47 sections, 7 theorems, 94 equations, 16 figures, 1 table, 3 algorithms.

Key Result

Theorem 1

Given Assumptions 2-4, for global models ${\bf{w}}_t$, ${\bf{w}}_{t+1} \in \mathcal{R}^{\rm NS}$ and $\varepsilon\ge A/\sqrt{2 \mu}$, the expected global loss function reduction between two consecutive rounds in the non-stable region $\mathcal{R}^{\rm NS}$ is lower bounded by ...

Figures (16)

  • Figure 1: Illustration of over-the-air distortion accelerated SemiFL. Devices upload local gradients and intermediate outputs, leveraging intentionally amplified over-the-air distortion to accelerate convergence. The base station (BS) updates the shallow layers ${\bf{w}}_{t,1}$ using the aggregated gradient ${\bf{g}}^{\rm L}_{t}$, and updates the deep layers ${\bf{w}}_{t,2}$ by combining ${\bf{g}}^{\rm L}_{t}$ with the edge gradient ${\bf{g}}^{\rm E}_{t}$. The entire updated model is obtained by assembling the shallow layers ${\bf{w}}_{t,1}$ and the deep layers ${\bf{w}}_{t,2}$.
  • Figure 2: A workflow illustration of the proposed over-the-air distortion accelerated SemiFL framework.
  • Figure 3: An illustration of the non-stable and stable regions of SemiFL.
  • Figure 4: Learning performance comparison between SemiFL and benchmarks on the Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets, where $\epsilon_1=10$, $\epsilon_2=1$, $\epsilon_3=0.8$, and $\epsilon_4=0.01$.
  • Figure 5: Learning performance comparison of the proposed SemiFL on the Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets with different $\epsilon_1$ values, where $\epsilon_3\!=\!0.8$ and $\epsilon_4\!=\!0.01$. Note that we set $\epsilon_2\!=\!1$ when $\epsilon_1\!=\!2$ or $5$, and set $\epsilon_2\!=\!5$ when $\epsilon_1\!=\!10$. When $\epsilon_1\!=\!300$, a sufficiently large $\epsilon_2$ is adopted.
  • ...and 11 more figures

Theorems & Definitions (7)

  • Theorem 1
  • Corollary 1
  • Corollary 2
  • Theorem 2
  • Corollary 3
  • Corollary 4
  • Lemma 1