URVFL: Undetectable Data Reconstruction Attack on Vertical Federated Learning

Duanyi Yao, Songze Li, Xueluan Gong, Sizai Hou, Gaoning Pan

TL;DR

URVFL is an undetectable data reconstruction attack on vertical federated learning (VFL). It integrates a discriminator with an auxiliary classifier (DAC) so that label information is used when generating malicious gradients. The attack pretrains an encoder/decoder on auxiliary data and then uses the DAC to align the victim's embedding distribution with the encoder's, enabling accurate reconstruction of the target features while remaining stealthy under state-of-the-art detectors. Across five representative datasets, URVFL and its synchronized variant achieve better reconstruction quality and greater robustness to detection than existing malicious and honest-but-curious (HBC) attacks. The work highlights a critical privacy risk in VFL and suggests that defenses must balance privacy guarantees against preserving honest training performance.

Abstract

Launching effective malicious attacks in VFL presents unique challenges: 1) given the distributed nature of clients' data features and models, each client rigorously guards its privacy and prohibits direct querying, complicating any attempt to steal data; 2) existing malicious attacks alter the underlying VFL training task and are hence easily detected by comparing the received gradients with those received in honest training. To overcome these challenges, we develop URVFL, a novel attack strategy that evades current detection mechanisms. The key idea is to integrate a discriminator with an auxiliary classifier that takes full advantage of the label information and generates malicious gradients for the victim clients. On the one hand, label information helps to better characterize the embeddings of samples from distinct classes, yielding improved reconstruction performance. On the other hand, computing malicious gradients with label information better mimics honest training, making the malicious gradients indistinguishable from honest ones and the attack much stealthier. Our comprehensive experiments demonstrate that URVFL significantly outperforms existing attacks and successfully circumvents SOTA detection methods for malicious attacks. Additional ablation studies and evaluations on defenses further underscore the robustness and effectiveness of URVFL. Our code will be available at https://github.com/duanyiyao/URVFL.
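
The abstract describes the discriminator with auxiliary classifier only in words. As a rough illustration of the idea, the sketch below combines a source-discrimination term with a label-classification term for a single embedding, in the style of an auxiliary-classifier GAN. This is an assumption made for illustration, not the authors' loss: the function names, the `class_weight` parameter, and the exact form of both terms are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softmax(logits):
    # Subtract the maximum before exponentiating to avoid overflow.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dac_loss(source_logit, class_logits, is_reference, label, class_weight=1.0):
    """Hypothetical per-sample loss for a discriminator with an auxiliary classifier.

    source_logit: scalar logit that the embedding came from the pretrained encoder
    class_logits: per-class logits from the auxiliary classification head
    is_reference: True for an encoder embedding, False for a victim embedding
    label:        integer class label, known to the active client
    class_weight: assumed weighting between the two terms
    """
    eps = 1e-12
    p = sigmoid(source_logit)
    target = 1.0 if is_reference else 0.0
    # Binary cross-entropy on the embedding's source (encoder vs. victim model).
    adversarial = -(target * math.log(p + eps) + (1.0 - target) * math.log(1.0 - p + eps))
    # Cross-entropy on the class label: the label-aware part of the signal.
    classification = -math.log(softmax(class_logits)[label] + eps)
    return adversarial + class_weight * classification


if __name__ == "__main__":
    # An embedding from the victim's model, with label 1 out of 3 classes.
    print(dac_loss(source_logit=-0.3, class_logits=[0.1, 2.0, -1.0],
                   is_reference=False, label=1))
```

The classification term is what makes the resulting gradients depend on the labels, which is the property the abstract credits for both better reconstruction and closer resemblance to honest training. How the two terms are actually weighted and optimized in URVFL is not stated in this section.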

Paper Structure (28 sections, 6 equations, 10 figures, 11 tables, 3 algorithms)

Figures (10)

  • Figure 1: Illustration of a VFL system with one active client and two passive clients. The area enclosed by the red dashed lines contains the information accessible to and the actions taken by the active client. In a data reconstruction attack, the malicious active client intends to recover the private features of the target passive clients.
  • Figure 2: Comparison of SplitGuard scores for two malicious attacks and their variants with modified loss functions.
  • Figure 3: Workflow of URVFL. The opaque rectangle indicates that the model is being trained, while the transparent rectangle represents that the model is frozen and only utilized in forward propagation.
  • Figure 4: t-SNE visualization on Credit dataset with 2 classes.
  • Figure 5: t-SNE visualization on MNIST dataset with 10 classes.
  • ...and 5 more figures