Leverage Variational Graph Representation For Model Poisoning on Federated Learning

Kai Li, Xin Yuan, Jingjing Zheng, Wei Ni, Falko Dressler, Abbas Jamalipour

TL;DR

This work tackles model-poisoning in federated learning when attackers can only overhear benign local updates. It proposes VGAE-MP, an adversarial variational graph autoencoder that learns the graph-structured correlations among benign updates and regenerates a malicious update $\boldsymbol{w}_j'(t)$ that degrades the global model while remaining close to the aggregated update for stealth. Key contributions include (i) a data-untethered poisoning framework that exploits feature correlations, (ii) a dual-variable optimization scheme with a knapsack-based bandwidth selection to control attack scope, (iii) a VGAE with a two-layer GCN encoder and inner-product decoder to maximize a reconstruction loss, and (iv) extensive experiments on MNIST, FashionMNIST, and CIFAR-10 showing gradual FL accuracy loss and detection evasion. The results highlight a practical threat in wireless FL settings, where eavesdropping enables potent, hard-to-detect poisoning that challenges current defenses and motivates new graph-aware protections.
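
The knapsack-based selection mentioned above can be illustrated with a generic 0/1 knapsack solved by dynamic programming. This is a minimal sketch, not the paper's formulation: the function name `select_models`, the per-model `values` (how useful an overheard model is assumed to be) and the integer `costs` (bandwidth assumed to be needed to overhear it) are hypothetical placeholders.

```python
def select_models(values, costs, budget):
    """Generic 0/1 knapsack via dynamic programming.

    values[i]: hypothetical usefulness of overhearing benign model i
    costs[i]:  hypothetical integer bandwidth cost of overhearing model i
    budget:    integer bandwidth budget of the attacker (assumed)
    Returns (best total value, sorted indices of the selected models).
    """
    n = len(values)
    # best[i][b] = best value using the first i models within budget b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, c = values[i - 1], costs[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]              # skip model i-1
            if c <= b and best[i - 1][b - c] + v > best[i][b]:
                best[i][b] = best[i - 1][b - c] + v  # take model i-1

    # Walk the table backwards to recover which models were taken.
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= costs[i - 1]
    return best[n][budget], sorted(chosen)


total, picked = select_models([6.0, 10.0, 12.0], [1, 2, 3], budget=5)
print(total, picked)  # 22.0 [1, 2]
```

With a budget of 5, the second and third models (costs 2 and 3) together give the highest total value, so the first model is left out even though it is the cheapest.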

Abstract

This paper puts forth a new training data-untethered model poisoning (MP) attack on federated learning (FL). The new MP attack extends an adversarial variational graph autoencoder (VGAE) to create malicious local models based solely on the overheard benign local models, without any access to the FL training data. The resulting VGAE-MP attack is not only effective but also hard to detect. The VGAE-MP attack extracts graph structural correlations among the benign local models and the training data features, adversarially regenerates the graph structure, and generates malicious local models using the adversarial graph structure and the benign models' features. Moreover, a new attack algorithm is presented that trains the malicious local models using the VGAE and sub-gradient descent while enabling an optimal selection of the benign local models for training the VGAE. Experiments demonstrate a gradual drop in FL accuracy under the proposed VGAE-MP attack and show that existing defense mechanisms fail to detect it, posing a severe threat to FL.
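
To make "graph structural correlations among the benign local models" concrete, the sketch below builds an adjacency matrix from a few flattened model vectors. It assumes, purely for illustration, that correlation is measured by cosine similarity and binarized with a threshold; the abstract does not specify the measure, and the names `cosine`, `correlation_graph`, and `threshold` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def correlation_graph(models, threshold=0.5):
    """Adjacency matrix over models: edge i-j if their similarity >= threshold.

    models: list of flattened parameter vectors (stand-ins for overheard updates).
    The threshold rule is an assumption of this sketch, not taken from the paper.
    """
    n = len(models)
    return [
        [1 if i != j and cosine(models[i], models[j]) >= threshold else 0
         for j in range(n)]
        for i in range(n)
    ]


overheard = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy 2-parameter "models"
print(correlation_graph(overheard))  # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
```

Only the first two toy models point in nearly the same direction, so they are the only pair connected by an edge.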

Paper Structure

This paper contains 16 sections, 24 equations, 8 figures, 2 tables, 1 algorithm.

Figures (8)

  • Figure 1: (a) Illustration of FL, where each benign user device trains a local model update on its own dataset. The edge server aggregates the benign local model updates into a global model, which is broadcast to the user devices so that they can update the training parameters of their local models. (b) By eavesdropping on the benign local model updates, the attacker performs the proposed VGAE-MP attack to create a malicious poisoning model that is sent to the server. The malicious model steers FL training in the opposite direction, thereby falsifying the local model updates of the devices.
  • Figure 2: The proposed VGAE-MP attack creates $\pmb{w}_j^\prime(t)$ by learning the correlations among the parameters of the models being trained in FL, i.e., $\pmb{w}_i(t),\,\forall i$. A graph encoder is trained on $\pmb{\mathcal{F}}_j$ and $\pmb{\mathcal{A}}_j$ to build a feature representation matrix $\pmb{\mathcal{Z}}$. The encoder output is fed to the decoder, which reconstructs $\pmb{\mathcal{A}}_j$. The VGAE-MP attack adjusts $\pmb{w}_j^\prime$ to maximize the reconstruction loss $\eta_{\rm loss}$ defined in \ref{eq_reconError}.
  • Figure 3: Given 100 FL communication rounds, $I$ = 5 and $J$ = 2, we study the local model's testing accuracy under the proposed VGAE-MP attack on the MNIST, FashionMNIST, and CIFAR-10 datasets.
  • Figure 4: The global model's testing accuracy ("avg" means the average value and "std" stands for the standard deviation) under the VGAE-MP attack on the MNIST, FashionMNIST, and CIFAR-10 datasets.
  • Figure 5: Given the MNIST, FashionMNIST, and CIFAR-10 datasets, the average testing accuracy of the local models under the VGAE-MP attack when $J$ increases from 1 to 5.
  • ...and 3 more figures
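
The encoder and decoder pipeline described in the Figure 2 caption can be sketched as a single forward pass: a two-layer GCN encoder maps a feature matrix and an adjacency matrix to a representation Z, an inner-product decoder reconstructs the adjacency, and a reconstruction loss compares the two. This is a simplified illustration under several assumptions: it is deterministic (the variational mean/variance sampling of a VGAE is omitted), it uses binary cross-entropy as a stand-in for the paper's $\eta_{\rm loss}$, the weights `w0` and `w1` are random placeholders rather than trained values, and no training or loss-maximization step is shown.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 thanks to the self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_encode(adj, feats, w0, w1):
    """Two-layer GCN: Z = A_norm * ReLU(A_norm * F * W0) * W1."""
    a_norm = normalize_adjacency(adj)
    hidden = [[max(0.0, v) for v in row] for row in matmul(matmul(a_norm, feats), w0)]
    return matmul(matmul(a_norm, hidden), w1)


def inner_product_decode(z):
    """Reconstruct edge probabilities as sigmoid(Z * Z^T)."""
    z_t = [list(col) for col in zip(*z)]
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in matmul(z, z_t)]


def reconstruction_loss(adj, adj_rec, eps=1e-9):
    """Mean binary cross-entropy between the true and reconstructed adjacency."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(n):
            p = min(max(adj_rec[i][j], eps), 1.0 - eps)
            total -= adj[i][j] * math.log(p) + (1.0 - adj[i][j]) * math.log(1.0 - p)
    return total / (n * n)


rng = random.Random(0)
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]                         # toy graph, 3 models
feats = [[rng.random() for _ in range(4)] for _ in range(3)]    # toy feature matrix
w0 = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(4)]  # placeholder weights
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)]

z = gcn_encode(adj, feats, w0, w1)
adj_rec = inner_product_decode(z)
print(reconstruction_loss(adj, adj_rec))
```

In a conventional graph autoencoder this loss would be minimized; according to the caption, the VGAE-MP attack instead adjusts the malicious model to maximize it.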