A Framework for Evaluating Privacy-Utility Trade-off in Vertical Federated Learning
Yan Kang, Jiahuan Luo, Yuanqin He, Xiaojin Zhang, Lixin Fan, Qiang Yang
TL;DR
The paper presents a formal evaluation framework for privacy-utility trade-offs in vertical federated learning (VFL) and applies it to three widely deployed VFL algorithms (VLR, VHNN, VSNN). It standardizes the assessment of protection mechanisms against a broad set of privacy attacks, revealing that non-cryptographic protections can effectively thwart many attacks (e.g., NS, DS, DL) but struggle with model completion (MC) and model inversion (MI) in complex settings. Key findings indicate that Marvell protection performs best against several VNN attacks and that VSNN is harder to protect than VHNN, with model structure secrecy helping to mitigate MI. The study delivers practical guidance and a reusable codebase for practitioners to select protections and tune parameters for given privacy and utility requirements, and it notes that crypto-based protections may be preferred when efficiency allows.
Abstract
Federated learning (FL) has emerged as a practical solution to data silo issues that does not compromise user privacy. One of its variants, vertical federated learning (VFL), has recently gained increasing attention because it matches enterprises' demand for leveraging more valuable features to build better machine learning models while preserving user privacy. Existing work on VFL concentrates on developing a specific protection or attack mechanism for a particular VFL algorithm. In this work, we propose an evaluation framework that formulates the privacy-utility evaluation problem. We then use this framework as a guide to comprehensively evaluate a broad range of protection mechanisms against most of the state-of-the-art privacy attacks for three widely deployed VFL algorithms. These evaluations may help FL practitioners select appropriate protection mechanisms given specific requirements. Our evaluation results demonstrate that the model inversion attack and most of the label inference attacks can be thwarted by existing protection mechanisms, whereas the model completion (MC) attack is difficult to prevent, which calls for more advanced MC-targeted protection mechanisms. Based on our evaluation results, we offer concrete advice on improving the privacy-preserving capability of VFL systems. The code is available at https://github.com/yankang18/Attack-Defense-VFL
