Hierarchical Federated Learning in Wireless Networks: Pruning Tackles Bandwidth Scarcity and System Heterogeneity

Md Ferdous Pervej; Richeng Jin; Huaiyu Dai

TL;DR

This paper tackles the challenge of bandwidth-scarce, heterogeneous wireless networks by proposing pruning-enabled hierarchical federated learning (PHFL) that injects model pruning into a four-tier FL stack (UE-VC-sBS-mBS-cloud). It derives a convergence bound that separates pruning errors and wireless-link effects, and then uses successive convex approximation (SCA) to jointly optimize pruning ratios, CPU frequency, and transmit power under strict delay and energy constraints. The authors demonstrate, through simulations on CIFAR-10/100 with CNN and ResNet architectures, that PHFL can substantially reduce training time, energy consumption, and bandwidth requirements while incurring only modest or negligible accuracy degradation, especially for bulky models. Overall, PHFL offers a practical pathway to enable efficient, privacy-preserving distributed learning in resource-constrained wireless networks with hierarchical aggregation and pruning-driven efficiency gains.
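The two mechanisms named above, model pruning at the clients and aggregation across tiers, can be illustrated with a minimal sketch. It assumes magnitude-based pruning and dataset-size-weighted averaging, which may differ from the paper's exact rules; the function names (`prune_by_magnitude`, `weighted_average`, `hierarchical_aggregate`) are hypothetical, and only two aggregation levels are shown for brevity.

```python
import numpy as np

def prune_by_magnitude(weights, ratio):
    """Zero out the `ratio` fraction of entries with the smallest magnitude.

    Assumption: magnitude-based pruning; the paper's criterion may differ.
    """
    flat = np.abs(weights).ravel()
    k = int(np.floor(ratio * flat.size))  # number of entries to remove
    if k == 0:
        return weights.copy()
    drop = np.argpartition(flat, k - 1)[:k]  # indices of the k smallest magnitudes
    mask = np.ones(flat.size, dtype=bool)
    mask[drop] = False
    return weights * mask.reshape(weights.shape)

def weighted_average(models, sizes):
    """Average equally shaped parameter arrays, weighted by dataset size."""
    sizes = np.asarray(sizes, dtype=float)
    return np.tensordot(sizes / sizes.sum(), np.stack(models), axes=1)

def hierarchical_aggregate(clusters):
    """Aggregate within each cluster first, then across clusters.

    `clusters` is a list of (models, sizes) pairs, one per lower-tier aggregator.
    """
    cluster_models, cluster_sizes = [], []
    for models, sizes in clusters:
        cluster_models.append(weighted_average(models, sizes))
        cluster_sizes.append(sum(sizes))
    return weighted_average(cluster_models, cluster_sizes)
```

With size-proportional weights at every level, the two-stage result equals a single weighted average over all clients, so in this sketch the hierarchy changes where and how often aggregation happens rather than what is computed in one round.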

Abstract

While a practical wireless network has many tiers where end users do not directly communicate with the central server, the users' devices have limited computation and battery power, and the serving base station (BS) has a fixed bandwidth. Owing to these practical constraints and system models, this paper leverages model pruning and proposes pruning-enabled hierarchical federated learning (PHFL) in heterogeneous networks (HetNets). We first derive an upper bound on the convergence rate that clearly demonstrates the impact of model pruning and of wireless communications between the clients and the associated BS. Then we jointly optimize the model pruning ratio, central processing unit (CPU) frequency and transmission power of the clients in order to minimize the controllable terms of the convergence bound under strict delay and energy constraints. However, since the original problem is not convex, we perform successive convex approximation (SCA) and jointly optimize the parameters for the relaxed convex problem. Through extensive simulations, we validate the effectiveness of our proposed PHFL algorithm in terms of test accuracy, wall clock time, energy consumption and bandwidth requirement.
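To make the SCA step concrete, here is a minimal sketch on a one-dimensional toy objective. It is not the paper's optimization problem: the objective `f`, the starting point, and the function name `sca_minimize` are assumptions chosen so that each convex surrogate has a closed-form minimizer. The non-convex objective is split into a convex part and a concave part, and the concave part is replaced by its tangent at the current iterate, which yields a convex upper bound that is minimized exactly.

```python
import numpy as np

def f(x):
    """Toy non-convex objective: convex part x^4 + x, concave part -3x^2."""
    return x**4 + x - 3.0 * x**2

def sca_minimize(x0, iters=50):
    """Successive convex approximation on the toy objective f.

    At iterate x_k the concave term -3x^2 is replaced by its tangent,
    -3*x_k^2 - 6*x_k*(x - x_k), giving a convex surrogate that upper-bounds f
    and touches it at x_k.
    """
    x = float(x0)
    history = [f(x)]
    for _ in range(iters):
        # Surrogate stationarity: 4x^3 + 1 - 6*x_k = 0, solved in closed form.
        x = float(np.cbrt((6.0 * x - 1.0) / 4.0))
        history.append(f(x))
    return x, history
```

Because each surrogate upper-bounds `f` and matches it at the current iterate, the objective value cannot increase from one iteration to the next, and the iterates settle at a stationary point of `f`, which need not be the global minimum.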

Paper Structure (30 sections, 19 theorems, 105 equations, 7 figures, 7 tables, 2 algorithms)

Introduction
Related Work
Our Contributions
System Model
Wireless Network Model
Pruning-Enabled Hierarchical Federated Learning (PHFL)
PHFL: Convergence Analysis
Assumptions
Convergence Analysis
Joint Problem Formulation and Solution
Problem Formulation
Problem Transformation
Simulation Results and Discussions
Simulation Setting
Performance Study
...and 15 more sections

Key Result

Theorem 1

When the assumptions in the Assumptions section hold and $\eta \leq 1/\beta$, we have a convergence bound (not reproduced here), where $\pmb{\delta}=\{ \delta_1^t, \dots, \delta_U^t \}_{t=0}^{T-1}$, $\pmb{\mathrm{f}}=\{\mathrm{f}_1^t, \dots, \mathrm{f}_U^t\}_{t=0}^{T-1}$, $\pmb{\mathrm{P}}=\{P_1^t,\dots, P_U^t\}_{t=0}^{T-1}$ and $\mathrm{f}_i^t$ is the $i^{\mathrm{th}}$ client's central processing unit (CPU) frequency in the …

Figures (7)

Figure 1: Pruning-enabled hierarchical FL system model
Figure 2: CDF of clients' pruning ratios in different VCs for different $\mathrm{t^{th}}$'s with different ML models
Figure 3: CDF of clients' $\mathrm{e}_i^{\mathrm{tot}}$'s in different VCs for different $\mathrm{t^{th}}$'s with different ML models
Figure 4: Trade-offs between test accuracies and required bandwidth for different $\mathrm{t^{th}}$'s with different ML models
Figure 5: Test accuracies for different $\mathrm{t^{th}}$'s with different ML models on different datasets
...and 2 more figures

Theorems & Definitions (33)

Theorem 1
Remark 1
Remark 2
Lemma 1
Remark 3
Lemma 2
Remark 4
Lemma 3
Lemma 4
Corollary 1
...and 23 more
