Cooperative Decentralized Backdoor Attacks on Vertical Federated Learning

Seohyun Lee, Wenzhi Fang, Anindya Bijoy Das, Seyyedali Hosseinalipour, David J. Love, Christopher G. Brinton

TL;DR

The paper addresses backdoor attacks in vertical federated learning (VFL) by proposing a server-free, cooperative attack in which multiple adversaries collude over a graph to implant a backdoor without relying on server gradients. It combines a variational autoencoder (VAE) with metric learning to locally infer target-label datapoints, uses a graph-based consensus to select datapoints for poisoning, and then embeds an intensity-based trigger split across the adversaries. The authors provide a convergence analysis showing a stationarity-gap bound that scales with the gradient perturbation $\delta(\rho)$, which increases with graph connectivity $\rho$. Experiments across five image datasets show that the proposed method achieves higher attack success rates (ASR) than baselines while maintaining clean data accuracy (CDA) and remaining robust to gradient-noise defenses. The results demonstrate that higher adversary connectivity and trigger intensity enhance attack potency, highlighting security risks in cross-device VFL and underscoring the need for defenses focusing on decentralized label inference and collaboration patterns.
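The consensus step mentioned above can be illustrated with a simple majority vote. This is only a minimal sketch: the function name `majority_vote`, the `threshold` parameter, and the representation of each adversary's local inference as a set of sample indices are assumptions made for illustration, not the paper's actual procedure.

```python
from collections import Counter


def majority_vote(local_predictions, threshold=0.5):
    """Return the sample indices flagged by more than `threshold` of the adversaries.

    local_predictions: list of sets, one per adversary (hypothetical format),
    each holding the indices that adversary inferred as target-label samples.
    """
    if not local_predictions:
        return []
    votes = Counter()
    for flagged in local_predictions:
        votes.update(set(flagged))  # one vote per adversary per index
    needed = threshold * len(local_predictions)
    return sorted(idx for idx, count in votes.items() if count > needed)


# Hypothetical local inferences from three colluding adversaries.
local_predictions = [{3, 7, 12, 20}, {3, 7, 15}, {3, 12, 20, 31}]
poison_indices = majority_vote(local_predictions)
print(poison_indices)
```

Indices that only a single adversary flags are dropped, which is one way collusion could filter out local inference errors.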

Abstract

Federated learning (FL) is vulnerable to backdoor attacks, where adversaries alter model behavior on target classification labels by embedding triggers into data samples. While these attacks have received considerable attention in horizontal FL, they are less understood for vertical FL (VFL), where devices hold different features of the samples, and only the server holds the labels. In this work, we propose a novel backdoor attack on VFL which (i) does not rely on gradient information from the server and (ii) considers potential collusion among multiple adversaries for sample selection and trigger embedding. Our label inference model augments variational autoencoders with metric learning, which adversaries can train locally. A consensus process over the adversary graph topology determines which datapoints to poison. We further propose methods for trigger splitting across the adversaries, with an intensity-based implantation scheme skewing the server towards the trigger. Our convergence analysis reveals the impact of backdoor perturbations on VFL indicated by a stationarity gap for the trained model, which we verify empirically as well. We conduct experiments comparing our attack with recent backdoor VFL approaches, finding that ours obtains significantly higher success rates for the same main task performance despite not using server information. Additionally, our results verify the impact of collusion on attack performance.
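To make the idea of trigger splitting with an intensity parameter concrete, the sketch below splits a small pattern column-wise across adversaries and blends each piece into that adversary's feature partition. Everything here is an assumption for illustration: the helper names `split_columns` and `embed_trigger`, the grayscale values in [0, 1], and the additive blend `pixel + gamma * pattern` with clipping are not taken from the paper, whose exact implantation rule is not given in this summary.

```python
def split_columns(pattern, num_parts):
    """Split a 2D pattern column-wise into `num_parts` nearly equal pieces."""
    width = len(pattern[0])
    bounds = [k * width // num_parts for k in range(num_parts + 1)]
    return [[row[bounds[k]:bounds[k + 1]] for row in pattern]
            for k in range(num_parts)]


def embed_trigger(partition, pattern, gamma, top=0, left=0):
    """Return a copy of `partition` with gamma * pattern added at (top, left).

    Values are clipped to [0, 1]; the additive blend is an assumed rule.
    """
    poisoned = [row[:] for row in partition]  # leave the input untouched
    for i, pattern_row in enumerate(pattern):
        for j, value in enumerate(pattern_row):
            r, c = top + i, left + j
            if 0 <= r < len(poisoned) and 0 <= c < len(poisoned[r]):
                blended = poisoned[r][c] + gamma * value
                poisoned[r][c] = min(1.0, max(0.0, blended))
    return poisoned


# Hypothetical setup: two adversaries, each holding a 3x3 slice of an image.
pattern = [[1, 0, 1, 0],
           [0, 1, 0, 1]]
pieces = split_columns(pattern, num_parts=2)
partitions = [[[0.0] * 3 for _ in range(3)] for _ in range(2)]
poisoned = [embed_trigger(p, piece, gamma=0.5)
            for p, piece in zip(partitions, pieces)]
```

A larger `gamma` makes the implanted pattern stand out more strongly against the background features, which is the intuition behind an intensity-based trigger.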

Paper Structure

This paper contains 26 sections, 1 theorem, 19 equations, 6 figures, 8 tables, 2 algorithms.

Key Result

Theorem 1

Suppose that the above assumptions hold, and the learning rate is upper bounded as $\eta_1^{(t)} = \eta_2^{(t)} = \cdots = \eta_{K}^{(t)} = \eta^{(t)} \leq \frac{1}{4L}$. Then, the iterates generated by the backdoored SplitNN and vanilla SplitNN satisfy

Figures (6)

  • Figure 1: Client-server sharing of embeddings and gradients in VFL. An example of a feature-partitioned datapoint undergoing a backdoor trigger implantation is shown. The adversaries send up their poisoned embeddings, which are then concatenated by the server and cause misclassification. Moreover, adversaries form a graph amongst each other, sharing their feature partitions to enhance insights on the samples they wish to poison.
  • Figure 2: Label inference methodology with modified VAE architecture. Initially, the adversaries share their feature partitions. At the $\mu$ layer of the VAE, triplet margin loss is employed to conduct metric learning via the known label datapoints. After training the VAE, the $\mu$ vectors are used to perform a classification task for inference of the target label, with the results of local inference being used in a majority voting scheme for a final collaborative inference of the indices.
  • Figure 3: Image generation and trigger-embedding process. The adversaries can choose one of two methods: (1) constructing a collaborative trigger on some position of the known adversary features, or (2) giving each adversary a smaller trigger. Method 1 may result in some adversaries not possessing any portion of the trigger pattern, i.e., only having the background.
  • Figure 4: Attack Success Rate (ASR) and Clean Data Accuracy (CDA) for MNIST, Fashion-MNIST, CIFAR-10, SVHN, and CIFAR100-20. The proposed method converges to a higher ASR value than the baselines (BadVFL [xuan2023practical] and VILLAIN [bai2023villain]) due to (1) having a higher label inference accuracy, as seen in Fig. 5(a), and (2) having an intensity-based trigger that makes it easier for the server to draw the association between the target label and the trigger.
  • Figure 5: (a) The accuracy of label inference across all three scenarios. The proposed method reaches higher accuracy than the baselines (BadVFL [xuan2023practical] and VILLAIN [bai2023villain]) even without access to server information. (b) Impact of varying the number of adversaries with Fashion-MNIST and MNIST. As the number of adversaries increases, the ASR generally increases. In addition, the proposed attack consistently has a higher or comparable ASR value relative to the baselines. (c) Varying the trigger intensity parameter $\gamma$. An increase in $\gamma$ leads to a more successful backdoor attack. (d) Performance of the ASR with different levels of graph connectivity. In general, the more connected the graph is, the better the performance of the backdoor attack.
  • ...and 1 more figure
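The Figure 2 caption mentions a triplet margin loss applied at the $\mu$ layer of the VAE. As a rough illustration, the standard triplet margin loss with Euclidean distance can be written as below. The function name, the default margin of 1.0, and the example vectors are assumptions; the caption does not specify the distance, margin, or how triplets are sampled.

```python
import math


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(d(a, p) - d(a, n) + margin, 0).

    anchor, positive, negative: equal-length sequences of floats, standing in
    for mu vectors. `positive` shares the anchor's label, `negative` does not.
    """
    d_pos = math.dist(anchor, positive)  # Euclidean distance (Python 3.8+)
    d_neg = math.dist(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


# Hypothetical 2-D mu vectors for three known-label datapoints.
anchor = [0.0, 0.0]
same_label = [0.0, 1.0]
other_label = [3.0, 4.0]

# Well-separated triplet: the negative is farther than the positive by more
# than the margin, so the loss is zero.
print(triplet_margin_loss(anchor, same_label, other_label))

# Swapping the roles gives a positive loss that training would push down.
print(triplet_margin_loss(anchor, other_label, same_label))
```

Minimizing this quantity pulls same-label $\mu$ vectors together and pushes different-label ones apart, which is what would make a simple classifier on the $\mu$ vectors workable for label inference.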

Theorems & Definitions (1)

  • Theorem 1