Improving Privacy-Preserving Vertical Federated Learning by Efficient Communication with ADMM

Chulin Xie, Pin-Yu Chen, Qinbin Li, Arash Nourian, Ce Zhang, Bo Li

TL;DR

This work tackles the heavy communication and privacy-budget costs of vertical federated learning (VFL) by introducing VIM, a framework with multiple server heads and ADMM-based optimization. By decomposing the global objective into client-specific subproblems, the ADMM-based method VIMADMM lets clients perform multiple local updates per round and exchange only ADMM variables, which reduces communication and improves utility under differential privacy (DP). The paper provides convergence guarantees and DP analyses, and reports empirical gains on four datasets, including under client-level DP and label DP. The learned heads also make each client's contribution visible, which enables client-level explanation and client denoising.
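The idea of several local updates per communication round can be illustrated on a toy problem. The sketch below is not the paper's VIMADMM algorithm: it assumes linear local models, a squared loss, no server heads, and no DP, and the function name `toy_vertical_admm` and all parameter names are hypothetical. It solves a least-squares problem whose features are split column-wise across clients, with the server holding an auxiliary variable `z` and a dual variable `lam`.

```python
import numpy as np

def toy_vertical_admm(X_blocks, y, rho=1.0, tau=5, lr=0.5, rounds=50):
    """Toy ADMM for  min 0.5*||z - y||^2  s.t.  sum_k X_k @ theta_k = z.

    Assumption-laden illustration only: linear clients, squared loss,
    Jacobi-style (parallel) client updates, tau gradient steps per round.
    """
    thetas = [np.zeros(Xk.shape[1]) for Xk in X_blocks]
    z = np.zeros(y.shape[0])    # auxiliary variable kept by the server
    lam = np.zeros(y.shape[0])  # dual variable kept by the server
    history = []                # training MSE at the start of each round
    for _ in range(rounds):
        # One upload per client per round: its current local output.
        outputs = [Xk @ th for Xk, th in zip(X_blocks, thetas)]
        total = sum(outputs)
        history.append(float(np.mean((total - y) ** 2)))
        # Server: closed-form z-update for the squared loss, then dual ascent.
        z = (y + lam + rho * total) / (1.0 + rho)
        lam = lam + rho * (total - z)
        # Clients: tau local gradient steps on the augmented Lagrangian,
        # using only the variables received from the server this round.
        for k, Xk in enumerate(X_blocks):
            target = z - (total - outputs[k])  # what client k should match
            for _ in range(tau):
                grad = Xk.T @ (lam + rho * (Xk @ thetas[k] - target))
                thetas[k] = thetas[k] - (lr / rho) * grad
    final = sum(Xk @ th for Xk, th in zip(X_blocks, thetas))
    history.append(float(np.mean((final - y) ** 2)))
    return thetas, history

# Synthetic, well-conditioned data split between two clients.
rng = np.random.default_rng(0)
n, d = 2000, 6
X = rng.normal(size=(n, d)) / np.sqrt(n)
y = X @ rng.normal(size=d)
thetas, history = toy_vertical_admm([X[:, :3], X[:, 3:]], y)
print(history[0], history[-1])
```

Each round costs one upload and one download per client regardless of `tau`, which is the communication pattern the summary describes. The step size here is tuned to the synthetic data's scaling and would need adjusting for other inputs.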

Abstract

Federated learning (FL) enables distributed resource-constrained devices to jointly train shared models while keeping the training data local for privacy purposes. Vertical FL (VFL), which allows each client to collect partial features, has attracted intensive research efforts recently. We identified the main challenges that existing VFL frameworks are facing: the server needs to communicate gradients with the clients for each training step, incurring high communication cost that leads to rapid consumption of privacy budgets. To address these challenges, in this paper, we introduce a VFL framework with multiple heads (VIM), which takes the separate contribution of each client into account, and enables an efficient decomposition of the VFL optimization objective to sub-objectives that can be iteratively tackled by the server and the clients on their own. In particular, we propose an Alternating Direction Method of Multipliers (ADMM)-based method to solve our optimization problem, which allows clients to conduct multiple local updates before communication, and thus reduces the communication cost and leads to better performance under differential privacy (DP). We provide the user-level DP mechanism for our framework to protect user privacy. Moreover, we show that a byproduct of VIM is that the weights of learned heads reflect the importance of local clients. We conduct extensive evaluations and show that on four vertical FL datasets, VIM achieves significantly higher performance and faster convergence compared with the state-of-the-art. We also explicitly evaluate the importance of local clients and show that VIM enables functionalities such as client-level explanation and client denoising. We hope this work will shed light on a new way of effective VFL training and understanding.
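The abstract does not spell out the DP mechanism. A common building block for such mechanisms is to clip what a client sends and add Gaussian noise; the sketch below shows only that generic step. The per-row clipping, the function name `clip_and_noise`, and the parameters `clip_norm` and `noise_multiplier` are assumptions made for illustration, not details taken from the paper, and no privacy accounting is performed.

```python
import numpy as np

def clip_and_noise(outputs, clip_norm, noise_multiplier, rng):
    """Generic clip-then-Gaussian-noise step (illustrative assumption).

    Each row of `outputs` is rescaled so its L2 norm is at most `clip_norm`,
    then Gaussian noise with std `noise_multiplier * clip_norm` is added.
    The resulting (epsilon, delta) depends on the unit of privacy and on
    accounting that this sketch does not do.
    """
    norms = np.linalg.norm(outputs, axis=1, keepdims=True)
    factors = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = outputs * factors
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=clipped.shape)
    return clipped, clipped + noise

rng = np.random.default_rng(0)
local_outputs = rng.normal(size=(8, 4)) * 3.0  # hypothetical client outputs
clipped, noisy = clip_and_noise(local_outputs, clip_norm=1.0,
                                noise_multiplier=0.5, rng=rng)
```

Because noise is paid once per communicated message, fewer communication rounds generally mean less total noise for a fixed privacy budget, which is the link between communication efficiency and DP utility that the abstract points to.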

Paper Structure (69 sections, 13 theorems, 62 equations, 5 figures, 14 tables, 2 algorithms)

Key Result

Theorem 1

Assume that $\mathcal{L}_{\mathrm{VIM}}$ is bounded from below, that is, $\underline{e} := \min_{\{\theta_k\}, \{W_k\}} \mathcal{L}_{\mathrm{VIM}}(\{\theta_k\}, \{W_k\}) > -\infty$. Assume that $\ell(z;\cdot)$ is $L$-Lipschitz smooth w.r.t. $z$ and $\mathcal{L}_{\ldots}$ [...] (B) Let $(\{W_k^{*}\}, \{\theta_k^{*}\}, \{z_j^{*}\}, \{\lambda_j^{*}\})$ denote any limit points o[...]

Figures (5)

  • Figure 1: Test accuracy of VFL methods in the with-model-splitting (first row) and without-model-splitting (second row) settings on four datasets. Our methods (VIMADMM and VIMADMM-J) outperform the baselines because ADMM enables multiple local updates ($\tau>1$). Compared with FedBCD under different numbers of local steps $\tau$, VIMADMM also converges faster and reaches higher accuracy, which shows that using the ADMM-related variables for local updates is more effective than using the stale partial gradient as in FedBCD.
  • Figure 2: Performance comparison when the server uses a non-linear MLP model. The ADMM-based method still outperforms the other baselines under general architectures with a non-linear server model.
  • Figure 3: Client-level explainability of VIM. Row 1 visualizes the input features. Row 2 shows the weight norms of the linear heads. Row 3 shows the test accuracy when each client's test input features are perturbed (the red line denotes the clean test accuracy). Row 4 shows the weight norms of the linear heads when only one client is noisy.
  • Figure 4: t-SNE visualizations of local embeddings from an important client and an unimportant client for VIMADMM.
  • Figure 5: Performance of VIMADMM with different values of the penalty factor $\rho$ on four datasets. VIMADMM is not sensitive to $\rho$ in the range 0.5 to 2.

Theorems & Definitions (26)

  • Theorem 1
  • proof: Proof Sketch
  • Definition 1: $(\epsilon,\delta)$-DP (Dwork and Roth, 2014)
  • Definition 2: Client-level $(\epsilon,\delta)$-DP (McMahan et al., 2018)
  • Theorem 2
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • ...and 16 more
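
For reference, the $(\epsilon,\delta)$-DP notion named in Definition 1 is usually stated as below. The symbols $\mathcal{M}$, $D$, $D'$, and $S$ are notation chosen here and may differ from the paper's. A randomized mechanism $\mathcal{M}$ satisfies $(\epsilon,\delta)$-DP if, for every pair of adjacent datasets $D$ and $D'$ and every set of outputs $S$,

$$
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\epsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta .
$$

Client-level variants such as the one named in Definition 2 typically keep the same inequality and change only the adjacency relation, so that $D$ and $D'$ differ in the data of one client rather than in a single record; the paper's exact adjacency definition is not reproduced on this page.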