Vertical Federated Learning: Concepts, Advances and Challenges
Yang Liu, Yan Kang, Tianyuan Zou, Yanhong Pu, Yuanqin He, Xiaozhou Ye, Ye Ouyang, Ya-Qin Zhang, Qiang Yang
TL;DR
This survey covers Vertical Federated Learning (VFL), a privacy-preserving paradigm in which parties hold different feature sets for the same users. It analyzes the VFL framework, problem formulation, and training protocols, and catalogs advances in efficiency, effectiveness, and privacy. It also reviews privacy attacks and defenses (cryptographic and non-cryptographic), data valuation, explainability, and fairness. The survey culminates in VFLow, a unified framework that considers the VFL problem under communication, computation, privacy, effectiveness, and fairness constraints. It closes with industrial applications in finance, healthcare, and advertising, and with open challenges such as interoperability and trustworthy deployment. The unified taxonomy and framework are intended to guide future research toward robust, scalable, and auditable VFL systems.
Abstract
Vertical Federated Learning (VFL) is a federated learning setting where multiple parties with different features about the same set of users jointly train machine learning models without exposing their raw data or model parameters. Motivated by the rapid growth in VFL research and real-world applications, we provide a comprehensive review of the concept and algorithms of VFL, as well as current advances and challenges in various aspects, including effectiveness, efficiency, and privacy. We provide an exhaustive categorization of VFL settings and privacy-preserving protocols and comprehensively analyze the privacy attacks and defense strategies for each protocol. We then propose a unified framework, termed VFLow, which considers the VFL problem under communication, computation, privacy, effectiveness, and fairness constraints. Finally, we review the most recent advances in industrial applications, highlighting open challenges and future directions for VFL.
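
To make the setting concrete, the following is a minimal sketch of vertically partitioned training, assuming two parties and a plain logistic regression model. It is not VFLow and not a protocol taken from the survey: the toy data, the names `X_a`, `X_b`, `vfl_step`, and the learning rate are all hypothetical. Intermediate values are exchanged in plaintext here, which can leak information; the privacy-preserving protocols the survey categorizes exist to protect exactly these exchanges.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 6 shared users, 5 features, binary labels.
n_users = 6
X = rng.normal(size=(n_users, 5))
y = rng.integers(0, 2, size=n_users).astype(float)

# Vertical partition: same users (rows), different features (columns).
X_a = X[:, :2]  # party A: 2 features, also holds the labels
X_b = X[:, 2:]  # party B: 3 features, no labels


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def vfl_step(X_a, X_b, y, w_a, w_b, lr=0.1):
    """One gradient step; raw features never leave their owner."""
    z_a = X_a @ w_a  # computed locally by party A
    z_b = X_b @ w_b  # computed locally by party B, then sent to A
    p = sigmoid(z_a + z_b)  # party A combines the partial outputs
    loss = float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))
    residual = (p - y) / len(y)  # computed by A, sent back to B
    # Each party updates only its own weights with its own features.
    new_w_a = w_a - lr * (X_a.T @ residual)
    new_w_b = w_b - lr * (X_b.T @ residual)
    return new_w_a, new_w_b, loss


w_a = np.zeros(X_a.shape[1])
w_b = np.zeros(X_b.shape[1])
losses = []
for _ in range(50):
    w_a, w_b, loss = vfl_step(X_a, X_b, y, w_a, w_b)
    losses.append(loss)

print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

In this sketch only the partial outputs `z_b` and the residual cross the party boundary, and the combined logits equal those of a centralized model on the concatenated features.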
