Accelerating Vertical Federated Learning

Dongqi Cai, Tao Fan, Yan Kang, Lixin Fan, Mengwei Xu, Shangguang Wang, Qiang Yang

TL;DR

This work addresses the prohibitive computation and communication overhead of vertical federated learning (VFL) under homomorphic encryption (HE) in cross-silo settings. It proposes an acceleration system built on the industrial FL platform FATE that combines a backup worker scheme with stale synchronous parallel (SSP) and principal component analysis (PCA)-based feature compression to reduce both encrypted computation and network traffic. The system reduces communication overhead by up to $65.26\%$ in heterogeneous networks and HE-induced computation overhead by up to $40.66\%$, without loss of security. These results suggest that privacy-preserving VFL can be scaled to real-world cross-industry collaborations and offer a practical blueprint for deploying secure VFL at scale.
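The backup worker idea mentioned above can be illustrated with a small simulation. This is a minimal sketch of the generic technique only (a synchronous round completes once the fastest results have arrived, ignoring a fixed number of stragglers); it is not the system's FATE implementation, it does not model SSP, and the names `round_time`, `simulate`, `slow_prob`, and `slow_factor` are hypothetical.

```python
import random


def round_time(worker_times, num_backup):
    """Duration of one synchronous round when the slowest
    `num_backup` workers are not waited for."""
    needed = len(worker_times) - num_backup
    if needed <= 0:
        raise ValueError("num_backup must be smaller than the number of workers")
    # The round ends when the `needed`-th fastest worker finishes.
    return sorted(worker_times)[needed - 1]


def simulate(num_workers, num_backup, slow_prob, slow_factor, rounds, seed=0):
    """Average round duration under a toy heterogeneity model (an assumption):
    each worker takes 1.0 time unit, or `slow_factor` units with
    probability `slow_prob`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        times = [
            slow_factor if rng.random() < slow_prob else 1.0
            for _ in range(num_workers)
        ]
        total += round_time(times, num_backup)
    return total / rounds


if __name__ == "__main__":
    for b in (0, 1, 2):
        avg = simulate(num_workers=8, num_backup=b, slow_prob=0.1,
                       slow_factor=10.0, rounds=1000)
        print(f"backup workers = {b}: average round time = {avg:.2f}")
```

In this toy model, a single slow worker stalls every round when no backup workers are configured, whereas tolerating a few stragglers keeps the round time close to that of the fast workers.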

Abstract

Privacy, security, and data governance constraints rule out brute-force integration of the cross-silo data arising from the development of the Internet of Things. Federated learning has been proposed so that all parties can collaboratively complete a training task while their data never leaves the local site. Vertical federated learning is a specialization of federated learning for data whose features are distributed across parties. To preserve privacy, homomorphic encryption is applied to enable operations on encrypted data without decryption. However, along with a robust security guarantee, homomorphic encryption brings extra communication and computation overhead. In this paper, we analyze the current bottlenecks of vertical federated learning under homomorphic encryption comprehensively and numerically. We propose a straggler-resilient and computation-efficient acceleration system that reduces the communication overhead in heterogeneous scenarios by up to 65.26% and the computation overhead caused by homomorphic encryption by up to 40.66%. Our system improves the robustness and efficiency of the current vertical federated learning framework without loss of security.
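To make "encrypted operations without decryption" concrete, the following is a toy sketch of an additively homomorphic scheme. The abstract does not name the scheme used, so Paillier is chosen here purely as an assumption for illustration. The key size is far too small for real use, and every function name is hypothetical.

```python
import math
import random


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key generation from two known (Mersenne) primes.
    Real deployments use randomly generated primes of 1024 bits or more."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                 # standard simplification for the generator
    mu = pow(lam, -1, n)      # with g = n + 1, mu is the inverse of lam mod n
    return (n, g), (lam, mu)


def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Ciphertext of m1 + m2, computed without decrypting either input."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def mul_plain(pub, c, k):
    """Ciphertext of k * m for a plaintext scalar k."""
    n, _ = pub
    return pow(c, k, n * n)


if __name__ == "__main__":
    pub, priv = keygen()
    c1, c2 = encrypt(pub, 12), encrypt(pub, 30)
    print(decrypt(pub, priv, add_encrypted(pub, c1, c2)))
    print(decrypt(pub, priv, mul_plain(pub, c1, 3)))
```

Each ciphertext is an integer modulo n squared, so it is much larger than the plaintext it encodes, and every operation is a big-integer modular exponentiation or multiplication. That is the source of the extra communication and computation overhead the abstract refers to.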

Paper Structure

This paper contains 24 sections, 11 equations, 16 figures, 1 table, 2 algorithms.

Figures (16)

  • Figure 1: Vertical federated learning, a specialization of federated learning for secure cross-silo cooperative learning.
  • Figure 2: The framework of vertical federated learning, where homomorphic encryption is applied to preserve privacy.
  • Figure 3: Runtime breakdown under different VFL settings.
  • Figure 4: Effect of backup workers: communication speedup. $Clean$ denotes no heterogeneity; $B_{i}$ denotes $i$ backup workers under varying degrees of heterogeneity (different slowdown probabilities). Dataset: MIMIC-III.
  • Figure 5: MIMIC
  • ...and 11 more figures