Tree-based Models for Vertical Federated Learning: A Survey

Bingchen Qian, Yuexiang Xie, Yaliang Li, Bolin Ding, Jingren Zhou

TL;DR

This article categorizes tree-based models in VFL into two types, i.e., feature-gathering models and label-scattering models, and provides a detailed discussion regarding their characteristics, advantages, privacy protection mechanisms, and applications.

Abstract

Tree-based models have achieved great success in a wide range of real-world applications due to their effectiveness, robustness, and interpretability, which has inspired their application to vertical federated learning (VFL) scenarios in recent years. In this paper, we conduct a comprehensive study to give an overall picture of applying tree-based models in VFL from the perspective of their communication and computation protocols. We categorize tree-based models in VFL into two types, i.e., feature-gathering models and label-scattering models, and provide a detailed discussion regarding their characteristics, advantages, privacy protection mechanisms, and applications. This study also covers the implementation of tree-based models in VFL, summarizing several design principles for better satisfying the requirements of both academic research and industrial deployment. We conduct a series of experiments to provide empirical observations on the differences between, and advances of, the different types of tree-based models.

Paper Structure

This paper contains 47 sections, 11 equations, 13 figures, 4 tables, and 2 algorithms.

Figures (13)

  • Figure 1: Communication and computation behaviors in a training round of horizontal and vertical FL.
  • Figure 2: Overview of tree-based models for vertical federated learning.
  • Figure 3: Comparisons between feature-gathering and label-scattering TBMs.
  • Figure 4: Illustration of the privacy protection algorithms proposed by FederBoost [tian2020federboost].
  • Figure 5: Illustration of the privacy protection algorithms proposed by OpBoost [li2022opboost].
  • ...and 8 more figures