Data Poisoning Attacks in Gossip Learning

Alexandre Pham; Maria Potop-Butucaru; Sébastien Tixeuil; Serge Fdida

TL;DR

This work is the first to propose a methodology to assess poisoning attacks in Decentralized Federated Learning in both churn-free and churn-prone scenarios. It also extends the gossipy simulator with an attack injector module.

Abstract

Traditional machine learning systems were designed in a centralized manner: a central entity maintains both the machine learning model and the data used to adjust the model's parameters. Because data centralization raises privacy issues, Federated Learning was introduced to reduce data sharing by having a central server coordinate the learning of multiple devices. While Federated Learning is more decentralized, it still relies on a central entity that may fail or be attacked, causing the whole system to fail. Decentralized Federated Learning removes the need for a central server entirely, letting the participating processes coordinate the construction of the model themselves. This distributed control makes it necessary to study malicious attacks mounted by the participants themselves. While poisoning attacks on Federated Learning have been studied extensively, their effects on Decentralized Federated Learning have not received the same level of attention. Our work is the first to propose a methodology to assess poisoning attacks in Decentralized Federated Learning in both churn-free and churn-prone scenarios. Furthermore, to evaluate our methodology on a case study representative of gossip learning, we extended the gossipy simulator with an attack injector module.

Paper Structure (5 sections, 5 figures)

Introduction
Case study: State-of-the-art Gossip Learning
Methodology
Simulation results
Conclusion

Figures (5)

Figure 1: Examples of clean and tampered data from the MNIST dataset.
Figure 2: Accuracy on the test set (left, higher is better) and backdoor set (right, lower is better) for different topologies with $n = 100$ (gray) and $n = 150$ (black), using the random Byzantine placement strategy in a churn-free system with $f = 30$ and $f = 45$, respectively. 'Baseline' curves show the results of Byzantine-free simulations, using the best value of $S$ among those studied here with Byzantine nodes.
Figure 3: Accuracy on the test set and backdoor set for different topologies with $n = 150$ and $f=45$ with random Byzantine placement strategy (gray) and classical Byzantine placement strategy (black).
Figure 4: Accuracy on the test set and backdoor set for $n = 150$, $S = 8$, and $f \in \{0, 5, 15, 20, 25, 40, 45\}$ with the random (gray) and classical (black) placement strategies when node degrees follow a Zipf law.
Figure 5: Accuracy on the test set and backdoor set for different topologies with $n = 100$ (gray) and $n = 150$ (black), with $f = 30$ and $f = 45$, respectively.
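
The captions above refer to tampered MNIST samples and to a separate backdoor set. As a rough illustration of what such tampering can look like, the following sketch stamps a small trigger patch onto an image and relabels it. The paper's actual tampering pattern is not given on this page, so the patch shape, its position, the target label, the poisoned fraction, and the function names `stamp_trigger` and `poison_dataset` are all assumptions made for illustration.

```python
import random


def stamp_trigger(image, patch_size=3, value=1.0):
    """Return a copy of a 2D image with a square patch in the bottom-right corner.

    The patch shape and position are hypothetical, not taken from the paper.
    """
    tampered = [row[:] for row in image]
    height, width = len(tampered), len(tampered[0])
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            tampered[r][c] = value
    return tampered


def poison_dataset(images, labels, target_label=0, fraction=0.5, seed=0):
    """Tamper a fraction of the samples and relabel them to target_label.

    Returns new lists (the inputs are left untouched) plus the set of
    indices that were poisoned.
    """
    rng = random.Random(seed)
    n_poison = int(len(images) * fraction)
    chosen = set(rng.sample(range(len(images)), n_poison))
    out_images, out_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i in chosen:
            out_images.append(stamp_trigger(img))
            out_labels.append(target_label)  # assumed attacker-chosen label
        else:
            out_images.append([row[:] for row in img])
            out_labels.append(lab)
    return out_images, out_labels, chosen


if __name__ == "__main__":
    # Ten blank 28x28 images stand in for MNIST digits.
    images = [[[0.0] * 28 for _ in range(28)] for _ in range(10)]
    labels = [i % 10 for i in range(10)]
    _, new_labels, chosen = poison_dataset(images, labels, target_label=7, fraction=0.3)
    print(sorted(chosen), new_labels)
```

Under these assumptions, a Byzantine node would train on the output of `poison_dataset`, while a backdoor set would consist of triggered images only, so that accuracy on it measures how often the trigger forces the target label.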
