A Framework for Evaluating Privacy-Utility Trade-off in Vertical Federated Learning

Yan Kang, Jiahuan Luo, Yuanqin He, Xiaojin Zhang, Lixin Fan, Qiang Yang

TL;DR

The paper presents a formal framework for evaluating privacy-utility trade-offs in vertical federated learning (VFL) and applies it to three widely deployed VFL algorithms (VLR, VHNN, and VSNN). The framework standardizes how protection mechanisms are assessed against a broad set of privacy attacks. The evaluation shows that non-cryptographic protections can thwart many attacks (e.g., NS, DS, DL) but struggle with model completion (MC) and model inversion (MI) in complex settings. Key findings: Marvell performs best against several VNN attacks, non-cryptographic protections struggle against MC, VSNN is harder to protect than VHNN, and keeping the model structure secret helps mitigate MI. The study offers practical guidance and a reusable codebase that practitioners can use to select protections and tune their parameters for given privacy and utility requirements, and it notes that crypto-based protections may be preferred when efficiency allows.

Abstract

Federated learning (FL) has emerged as a practical solution to data silo issues that does not compromise user privacy. One of its variants, vertical federated learning (VFL), has recently gained increasing attention because it matches enterprises' demand for leveraging more valuable features to build better machine learning models while preserving user privacy. Current work in VFL concentrates on developing a specific protection or attack mechanism for a particular VFL algorithm. In this work, we propose an evaluation framework that formulates the privacy-utility evaluation problem. We then use this framework as a guide to comprehensively evaluate a broad range of protection mechanisms against most of the state-of-the-art privacy attacks for three widely deployed VFL algorithms. These evaluations may help FL practitioners select appropriate protection mechanisms given specific requirements. Our evaluation results demonstrate that the model inversion attack and most of the label inference attacks can be thwarted by existing protection mechanisms, whereas the model completion (MC) attack is difficult to prevent, which calls for more advanced MC-targeted protection mechanisms. Based on our evaluation results, we offer concrete advice on improving the privacy-preserving capability of VFL systems. The code is available at https://github.com/yankang18/Attack-Defense-VFL

Paper Structure (27 sections, 26 equations, 11 figures, 18 tables)

Figures (11)

  • Figure 1: Vertical federated learning
  • Figure 2: The adversary launches an attack $\mathcal{K}$ to infer the private data $\mathcal{D}$ from the privacy vulnerability $\langle\mathcal{W}\rangle$, which is protected by the protection mechanism $\mathcal{P}$. The privacy leakage $\epsilon_p$ measures the adversary's privacy payoff, whereas the utility loss $\epsilon_u$ measures the utility difference between the joint model $g$ and the protected one $\langle g \rangle$. $\langle\cdot\rangle$ denotes the protection operation.
  • Figure 3: The unified training procedure for VLR, VHNN and VSNN.
  • Figure 4: Comparison of privacy-utility (PU) trade-offs of DP-L, GC, D-SGD, MN and ISO against NS, DS and DL attacks on Credit and Vehicle in VLR. The first column shows evaluations on Credit; the second column shows evaluations on Vehicle.
  • Figure 5: Comparison of privacy-utility (PU) trade-offs of DP-L, GC, D-SGD, MN and ISO against RR and GI attacks on Credit and Vehicle in VLR. The first column shows evaluations on Credit; the second column shows evaluations on Vehicle.
  • ...and 6 more figures

Theorems & Definitions (2)

  • Definition 1: Privacy Leakage
  • Definition 2: Utility Loss
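
The two quantities named in Definitions 1 and 2 can be illustrated with a minimal sketch. The Figure 2 caption states only that the utility loss $\epsilon_u$ is the utility difference between the joint model $g$ and the protected model $\langle g \rangle$, and that the privacy leakage $\epsilon_p$ measures the adversary's payoff. The formal definitions are not reproduced in this outline, so everything else below is an assumption made for illustration: utility is taken to be test accuracy, the adversary's payoff is taken to be the accuracy of a label inference attack, and the names `accuracy`, `utility_loss`, `privacy_leakage`, and `TradeOff` are hypothetical and not taken from the paper's codebase.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TradeOff:
    """One point on a privacy-utility curve (hypothetical container)."""
    privacy_leakage: float  # epsilon_p: adversary's payoff (assumed: attack accuracy)
    utility_loss: float     # epsilon_u: utility drop caused by the protection


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of positions where the prediction equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("labels must not be empty")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def utility_loss(unprotected_preds, protected_preds, labels) -> float:
    """Assumed form of epsilon_u: accuracy of g minus accuracy of <g>."""
    return accuracy(unprotected_preds, labels) - accuracy(protected_preds, labels)


def privacy_leakage(inferred_labels, private_labels) -> float:
    """Assumed form of epsilon_p: how often the attack recovers a private label."""
    return accuracy(inferred_labels, private_labels)


# Toy data, invented for this example only.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
unprotected_preds = [1, 0, 1, 1, 0, 0, 1, 1]  # predictions of the joint model g
protected_preds = [1, 0, 1, 0, 0, 0, 1, 1]    # predictions of the protected model <g>
inferred_labels = [1, 1, 1, 0, 0, 1, 1, 0]    # labels guessed by the adversary

trade_off = TradeOff(
    privacy_leakage=privacy_leakage(inferred_labels, labels),
    utility_loss=utility_loss(unprotected_preds, protected_preds, labels),
)
print(trade_off)
```

Under these assumptions, a protection mechanism is preferable when it lowers `privacy_leakage` while keeping `utility_loss` small. Repeating the measurement for each protection strength would yield one curve per mechanism, which is the kind of comparison the Figure 4 and Figure 5 captions describe.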