KDk: A Defense Mechanism Against Label Inference Attacks in Vertical Federated Learning

Marco Arazzi; Serena Nicolazzo; Antonino Nocera

TL;DR

The paper tackles privacy risks in Vertical Federated Learning where label leakage can occur via gradients. It proposes KD$k$, a defense that blends Knowledge Distillation (KD) with $k$-anonymity to replace hard labels with anonymized soft-label sets produced by a teacher network, thereby obscuring the true label from potential attackers. Empirical results across multiple datasets show that label inference attack success rates drop substantially (often by over 60%) while the overall VFL accuracy remains nearly intact. KD$k$ also demonstrates superior robustness compared with existing defenses across passive, active, and direct label inference attacks, highlighting its practical impact for privacy-preserving VFL deployments.

Abstract

Vertical Federated Learning (VFL) is a category of Federated Learning in which models are trained collaboratively among parties with vertically partitioned data. Typically, in a VFL scenario, the labels of the samples are kept private from all parties except the aggregating server, which is the label owner. Nevertheless, recent works have shown that, by exploiting the gradient information returned by the server to the bottom models and knowing only a small set of auxiliary labels for a very limited subset of training data points, an adversary can infer the private labels. These attacks are known as label inference attacks in VFL. In this work, we propose a novel framework called KDk, which combines Knowledge Distillation and k-anonymity to provide a defense mechanism against potential label inference attacks in a VFL scenario. Through an exhaustive experimental campaign, we demonstrate that applying our approach consistently decreases the performance of the analyzed label inference attacks, in some cases by more than 60%, while keeping the accuracy of the whole VFL system almost unaltered.
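To make the idea of replacing hard labels with anonymized soft-label sets more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the teacher network has already produced a probability vector per sample, and that anonymization means spreading the label mass almost evenly over a group of k classes containing the true one. The function name `k_anonymize_soft_labels`, the parameter `epsilon`, and the exact way the mass is distributed are all hypothetical choices made for illustration; this page does not specify them.

```python
import numpy as np


def k_anonymize_soft_labels(teacher_probs, true_labels, k=3, epsilon=0.05):
    """Illustrative only: hide each true label inside a group of k classes.

    teacher_probs: (n_samples, n_classes) probabilities from a teacher model.
    true_labels:   (n_samples,) integer class indices.
    epsilon:       small extra mass on the true class (an assumption here),
                   so that training on the soft labels still favors it.
    """
    teacher_probs = np.asarray(teacher_probs, dtype=float)
    true_labels = np.asarray(true_labels, dtype=int)
    n_samples, n_classes = teacher_probs.shape
    if not 1 <= k <= n_classes:
        raise ValueError("k must be between 1 and the number of classes")

    anonymized = np.zeros_like(teacher_probs)
    for i in range(n_samples):
        scores = teacher_probs[i].copy()
        # Force the true class into the group, then fill the remaining
        # k - 1 slots with the classes the teacher finds most likely.
        scores[true_labels[i]] = np.inf
        group = np.argsort(-scores)[:k]
        # Spread the mass evenly over the group; each row still sums to 1.
        anonymized[i, group] = (1.0 - epsilon) / k
        anonymized[i, true_labels[i]] += epsilon
    return anonymized


# Toy usage with made-up teacher outputs for two samples and four classes.
probs = np.array([[0.70, 0.20, 0.05, 0.05],
                  [0.10, 0.60, 0.25, 0.05]])
labels = np.array([0, 1])
soft = k_anonymize_soft_labels(probs, labels, k=2, epsilon=0.1)
```

In this sketch the true class keeps only a small margin over the other members of its group, which is one way to picture how gradients computed from such labels could reveal less about the exact label while remaining usable for training.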

Paper Structure (23 sections, 6 equations, 7 figures, 9 tables, 1 algorithm)

Introduction
Related Work
Background
Federated Learning
k-Anonymity
Knowledge Distillation
Label Inference Attacks
Passive Label Inference Attack
Active Label Inference Attack
Direct Label Inference Attack
Approach Description
Experimental Results
Testbeds description
Label Inference Attacks Performance Comparison
Passive Label Inference Attack
...and 8 more sections

Figures (7)

Figure 1: The Federated Learning workflow
Figure 2: The three categories of FL divided for feature and sample spaces
Figure 3: Generic architecture of knowledge distillation using a teacher-student model
Figure 4: Label inference attack scenario against VFL
Figure 5: KD$k$ main components
...and 2 more figures
