
Secure Vertical Federated Learning Under Unreliable Connectivity

Xinchi Qiu, Heng Pan, Wanru Zhao, Yan Gao, Pedro P. B. Gusmao, William F. Shen, Chenyang Ma, Nicholas D. Lane

TL;DR

This work tackles secure vertical federated learning under unreliable connectivity by introducing vFedSec, a dropout-tolerant framework built around a lightweight Secure Layer and embedding-padding. It defines a generalized VFL setting with multiple feature groups and an active party holding labels, and employs noise-based masking combined with quantization to achieve privacy without the heavy overhead of homomorphic encryption. The authors provide a formal privacy guarantee and demonstrate that dropout events can be tolerated without sacrificing convergence or accuracy, while achieving orders-of-magnitude improvements in computation and communication efficiency compared to HE-based baselines. Empirical results across four datasets show robust performance under dropout and high scalability, underscoring the practical impact for privacy-preserving distributed learning in realistic, connectivity-constrained environments.

Abstract

Most work in privacy-preserving federated learning (FL) has focused on horizontally partitioned datasets, where clients hold the same features and train complete client-level models independently. However, in vertical FL (VFL) settings, the features of an individual data point are often scattered across different institutions, known as clients. Addressing this category of FL necessitates the exchange of intermediate outputs and gradients among participants, resulting in potential privacy leakage risks and slow convergence rates. Additionally, in many real-world scenarios, VFL training also faces the acute issue of client stragglers and drop-outs, a serious challenge that can significantly hinder the training process but has been largely overlooked in existing studies. In this work, we present vFedSec, the first dropout-tolerant VFL protocol, which can support the most generalized vertical framework. It achieves secure and efficient model training by using an innovative Secure Layer alongside an embedding-padding technique. We provide theoretical proof that our design attains enhanced security while maintaining training performance. Empirical results from extensive experiments also demonstrate that vFedSec is robust to client dropout and provides secure training with negligible computation and communication overhead. Compared to widely adopted homomorphic encryption (HE) methods, our approach achieves a speedup of more than 690x and reduces communication costs by more than 9.6x.

Paper Structure

This paper contains 27 sections, 1 theorem, 7 equations, 4 figures, 6 tables.

Key Result

Lemma 4.1

Let $U_i$ be the set of clients in group $i$ together with the active client $C_0$. Let $\{h'_u\}_{u\in U}$, where $\forall u \in U, h_u \in \mathbb{Z}^m_R$, be the quantized intermediate embedding of each client, and let $n_{u,v}$ be the random noise associated with clients $u$ and $v$, uniformly sampled from the fi [...] where '$\equiv$' denotes that the distributions are identical.
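
The lemma statement above is truncated, so the following is only a minimal sketch of generic pairwise additive masking over $\mathbb{Z}_R$, not the paper's exact Secure Layer. It illustrates why noise $n_{u,v}$ shared by a pair of clients can hide each quantized embedding while cancelling in the sum. The modulus `R`, the fixed-point `SCALE`, the sign convention, and all function names are assumptions made for illustration. A real deployment would derive the noise from pairwise shared secrets with a cryptographically secure generator rather than from Python's `random` module.

```python
import random

R = 2**32       # modulus of the ring Z_R (assumed value)
SCALE = 2**16   # fixed-point scale used for quantization (assumed value)

def quantize(vec):
    """Map real-valued entries to Z_R using fixed-point rounding."""
    return [round(x * SCALE) % R for x in vec]

def dequantize(vec):
    """Map Z_R entries back to reals, reading values >= R/2 as negative."""
    return [((x - R) if x >= R // 2 else x) / SCALE for x in vec]

def pairwise_noise(client_ids, m, seed=0):
    """Draw one uniform noise vector n_{u,v} in Z_R^m per unordered client pair."""
    rng = random.Random(seed)  # illustration only, not cryptographically secure
    ids = sorted(client_ids)
    return {(u, v): [rng.randrange(R) for _ in range(m)]
            for i, u in enumerate(ids) for v in ids[i + 1:]}

def mask(u, h_u, noise):
    """Add n_{u,v} for partners v > u and subtract it for partners v < u."""
    out = list(h_u)
    for (a, b), n in noise.items():
        if a == u:
            out = [(x + y) % R for x, y in zip(out, n)]
        elif b == u:
            out = [(x - y) % R for x, y in zip(out, n)]
    return out

# Three hypothetical clients, each holding a 2-dimensional embedding.
embeddings = {0: [0.5, -1.25], 1: [2.0, 0.75], 2: [-0.5, 3.0]}
noise = pairwise_noise(embeddings, m=2)
masked = {u: mask(u, quantize(h), noise) for u, h in embeddings.items()}

# The aggregator only sees masked vectors; the noise cancels in their sum.
total = [sum(col) % R for col in zip(*masked.values())]
recovered = dequantize(total)
print(recovered)
```

Each noise vector is added by one member of the pair and subtracted by the other, so the modular sum of the masked vectors equals the modular sum of the quantized embeddings. The example values are multiples of 1/4, so the fixed-point round trip is exact here; in general, quantization introduces rounding error.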

Figures (4)

  • Figure 1: Detailed illustration of vFedSec, with the Secure Layer and the embedding-padding technique, during the forward pass when drop-out occurs. A client group contains clients with the same set of features and can be a single client or multiple clients. Each client group is responsible for updating part of the embedding vector, which is fed to the top model on the server side.
  • Figure 2: Performance under different drop-out setups. The experimental protocol can be found in the experimental setup section. The partition number refers to the number of partitioned feature spaces.
  • Figure 3: Comparison of average CPU time for different batch sizes, using vFedSec and HE from Phe and SEAL-Python. The Y-axis is on a log scale. The results are averaged over 10 experiments.
  • Figure 4: A high-level view of our protocol. Labels in the forward pass section are only required in training.
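
The Figure 1 caption describes each client group filling its own part of the embedding vector that the server-side top model consumes. One way to picture how this tolerates drop-out is sketched below, under the assumption that each group pads its partial embedding with zeros to the full length and that the padded vectors are then summed. The zero padding value, the slot offsets, and the function names are all hypothetical and are not taken from the paper.

```python
def pad_embedding(part, start, total_dim):
    """Place a group's partial embedding at its slot and zero-pad the rest."""
    if start < 0 or start + len(part) > total_dim:
        raise ValueError("partial embedding does not fit in the full vector")
    return [0.0] * start + list(part) + [0.0] * (total_dim - start - len(part))

def combine(padded_vectors, total_dim):
    """Sum the padded vectors elementwise to form the top-model input."""
    total = [0.0] * total_dim
    for vec in padded_vectors:
        total = [a + b for a, b in zip(total, vec)]
    return total

# Hypothetical layout: group A owns slots 0-1, B owns 2-4, C owns slot 5.
offsets = {"A": 0, "B": 2, "C": 5}
outputs = {"A": [0.1, 0.2], "C": [0.9]}  # group B dropped out this round

padded = [pad_embedding(outputs[g], offsets[g], 6) for g in outputs]
top_model_input = combine(padded, 6)
print(top_model_input)
```

Under these assumptions the top-model input always has the same length: the slots of a group that dropped out simply stay at the padding value, and the forward pass can proceed.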

Theorems & Definitions (2)

  • Lemma 4.1
  • proof