Revisiting Label Inference Attacks in Vertical Federated Learning: Why They Are Vulnerable and How to Defend

Yige Liu, Dexuan Xu, Zimai Guo, Yongzhi Cao, Hanpin Wang

Abstract

Vertical federated learning (VFL) allows an active party with a top model and multiple passive parties with bottom models to collaborate. In this scenario, passive parties possessing only features may attempt to infer the active party's private labels, making label inference attacks (LIAs) a significant threat. Previous LIA studies have claimed that well-trained bottom models can effectively represent labels. However, we demonstrate that this view is misleading and exposes the vulnerability of existing LIAs. By leveraging mutual information, we present the first observation of the "model compensation" phenomenon in VFL. We theoretically prove that, in VFL, the mutual information between layer outputs and labels increases with layer depth, indicating that bottom models primarily extract feature information while the top model handles label mapping. Building on this insight, we introduce task reassignment to show that the success of existing LIAs actually stems from the distribution alignment between features and labels. When this alignment is disrupted, the performance of LIAs declines sharply or even fails entirely. We also investigate the implications of this insight for defenses and propose a zero-overhead defense technique based on layer adjustment. Extensive experiments across five datasets and five representative model architectures indicate that shifting cut layers forward to increase the proportion of top model layers in the entire model not only improves resistance to LIAs but also enhances other defenses.
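
The abstract's central quantity is the mutual information between a layer's outputs and the labels. As a rough illustration of what that quantity measures, the sketch below computes a plug-in (empirical-frequency) estimate for two discrete sequences. It is not the estimator used in the paper, which this section does not specify; the function name `mutual_information` and the toy data are assumptions made for illustration, and continuous layer outputs would first need binning or a different estimator.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two equal-length discrete sequences.

    Hypothetical helper for illustration only; not the paper's estimator.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")

    n = len(xs)
    joint = Counter(zip(xs, ys))  # empirical counts of (x, y) pairs
    count_x = Counter(xs)         # empirical counts of x
    count_y = Counter(ys)         # empirical counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy data (assumed): one representation that determines the label,
# and one that is independent of it.
labels = [0, 0, 1, 1]
informative = [0, 0, 1, 1]
uninformative = [0, 1, 0, 1]

mi_informative = mutual_information(informative, labels)
mi_uninformative = mutual_information(uninformative, labels)
```

In this toy case the representation that determines the label carries one bit about it, while the independent one carries none. Under the abstract's claim, a bottom model's outputs would sit closer to the second case than prior LIA work assumed.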

Paper Structure

This paper contains 20 sections, 16 equations, 11 figures, and 2 tables.

Figures (11)

  • Figure 1: An illustration of the vertical federated learning framework. Solid lines represent forward propagation and dashed lines denote backpropagation.
  • Figure 2: Mutual information between the outputs of each layer and the labels for different datasets and models. In each subfigure, the left side of the gray dotted line corresponds to the bottom model, while the right side corresponds to the top model. Lines of different colors indicate different passive parties. From left to right, the number of passive parties gradually increases.
  • Figure 3: The mutual information between the cut layer of the bottom model and the labels decreases as the number of passive parties increases.
  • Figure 4: The topology of the lumped Markov chains in VFL.
  • Figure 5: Trends of attack accuracy and mutual information under different tasks. The first column presents the attack accuracy of LIA with cluster method. The second column presents the attack accuracy of LIA with model completion method. The third column presents the mutual information between the dataset features and the labels.
  • ...and 6 more figures