A Novel Framework of Horizontal-Vertical Hybrid Federated Learning for EdgeIoT

Kai Li, Yilei Liang, Xin Yuan, Wei Ni, Jon Crowcroft, Chau Yuen, Ozgur B. Akan

TL;DR

HoVeFL tackles privacy and data heterogeneity in EdgeIoT by combining horizontal FL (devices share the same features but hold non-IID samples) with vertical FL (devices hold the same samples but analyze different features). The authors formulate a hybrid training objective and provide a FedAvg-like aggregation to produce a global model, along with a convergence analysis under the Polyak-Łojasiewicz condition. Empirical results on CIFAR-10 and SVHN show that, while Adam acceleration helps convergence, a configuration dominated by horizontal FL devices (12 horizontal, six vertical) incurs 5.5% and 25.2% higher testing loss, respectively, than one dominated by vertical FL devices (six horizontal, 12 vertical), illustrating a trade-off between handling data heterogeneity and learning efficiency. The work points toward unified FL frameworks for privacy-preserving edge environments and offers theoretical and empirical insights for future optimization.
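For readers unfamiliar with the term, the sketch below shows what a FedAvg-style aggregation step typically looks like: the server averages the devices' parameter vectors, weighting each by its local sample count. This is the standard FedAvg rule, not the exact HoVeFL aggregation, which this summary does not spell out; the function name `fedavg` and the flat-list model representation are illustrative assumptions.

```python
from typing import List, Sequence


def fedavg(local_models: Sequence[Sequence[float]],
           sample_counts: Sequence[int]) -> List[float]:
    """Weighted average of local parameter vectors (standard FedAvg).

    Assumption: each local model is a flat list of floats of equal length,
    and each device's weight is proportional to its number of samples.
    """
    if not local_models or len(local_models) != len(sample_counts):
        raise ValueError("need one sample count per local model")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(local_models[0])
    if any(len(model) != dim for model in local_models):
        raise ValueError("all local models must have the same length")

    global_model = [0.0] * dim
    for model, count in zip(local_models, sample_counts):
        weight = count / total  # device's share of all training samples
        for k, value in enumerate(model):
            global_model[k] += weight * value
    return global_model


if __name__ == "__main__":
    # Two hypothetical devices holding 1 and 3 samples respectively.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In a hybrid setting, how horizontal and vertical devices are weighted against each other is a design choice of the framework; the uniform per-sample weighting here is only the simplest option.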

Abstract

This letter puts forth a new hybrid horizontal-vertical federated learning (HoVeFL) for mobile edge computing-enabled Internet of Things (EdgeIoT). In this framework, certain EdgeIoT devices train local models using the same data samples but analyze disparate data features, while the others focus on the same features using non-independent and identically distributed (non-IID) data samples. Thus, even though the data features are consistent, the data samples vary across devices. The proposed HoVeFL formulates the training of local and global models to minimize the global loss function. Performance evaluations on CIFAR-10 and SVHN datasets reveal that the testing loss of HoVeFL with 12 horizontal FL devices and six vertical FL devices is 5.5% and 25.2% higher, respectively, compared to a setup with six horizontal FL devices and 12 vertical FL devices.

Paper Structure (6 sections, 1 theorem, 17 equations, 3 figures)

This paper contains 6 sections, 1 theorem, 17 equations, 3 figures.

Key Result

Corollary 1

The convergence upper bound is convex with respect to the number of communication rounds, i.e., $T$, if $\mu_t \leq \frac{1}{L}$ and $\Theta \geq \frac{(1- 2 \rho \mu_t + 2 \rho L \mu_t^2) L \mu_t \sigma_2^2}{1 - L \mu_t}$.
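The two conditions in Corollary 1 can be checked numerically for a candidate set of constants. The sketch below is only a direct evaluation of the inequalities as stated; the function name and argument names are illustrative, and the roles of the symbols are not defined in this summary, so they are treated as plain positive numbers. One further assumption: the boundary case $\mu_t = \frac{1}{L}$ makes the denominator $1 - L\mu_t$ zero, so the sketch reports the threshold on $\Theta$ as undefined and returns `False` there.

```python
def corollary1_conditions_hold(mu_t: float, L: float, rho: float,
                               sigma2_sq: float, theta: float) -> bool:
    """Evaluate the two conditions of Corollary 1 for given constants.

    Conditions (as stated in the corollary):
        mu_t <= 1 / L
        theta >= (1 - 2*rho*mu_t + 2*rho*L*mu_t**2) * L * mu_t * sigma2_sq
                 / (1 - L * mu_t)
    """
    if L <= 0 or mu_t <= 0:
        raise ValueError("L and mu_t must be positive")

    # First condition: the step size is at most 1/L.
    if mu_t > 1.0 / L:
        return False

    denominator = 1.0 - L * mu_t
    if denominator <= 0.0:
        # Assumption of this sketch: at mu_t == 1/L the threshold is
        # undefined (division by zero), so the check is reported as failing.
        return False

    threshold = ((1.0 - 2.0 * rho * mu_t + 2.0 * rho * L * mu_t ** 2)
                 * L * mu_t * sigma2_sq) / denominator
    return theta >= threshold


if __name__ == "__main__":
    # Hypothetical constants chosen only to exercise the check.
    print(corollary1_conditions_hold(mu_t=0.01, L=10.0, rho=1.0,
                                     sigma2_sq=1.0, theta=0.2))
```

With the hypothetical constants above, the threshold evaluates to roughly 0.109, so any $\Theta$ at or above that value satisfies the second condition.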

Figures (3)

  • Figure 1: The training process of the local and global models in HoVeFL, where each data feature $i$ ($i \in [1, I]$) is trained by $N_i$ devices. Each device also holds data samples $j$; thus, $n_{i,j} \in \mathbf{N_i} \cap \mathbf{N_j}$.
  • Figure 2: The training and testing loss based on CIFAR-10.
  • Figure 3: The training and testing loss based on SVHN.

Theorems & Definitions (2)

  • Corollary 1
  • proof