FAT: Federated Adversarial Training

Giulio Zizzo; Ambrish Rawat; Mathieu Sinn; Beat Buesser

TL;DR

The paper proposes Federated Adversarial Training (FAT), which combines adversarial training with federated learning by framing each client's local updates as a min–max optimization and analyzing aggregation at the server. FAT is evaluated on MNIST, Fashion-MNIST, and CIFAR10 in idealized federated settings, with a further study of training stability on FE-MNIST from the LEAF benchmark and of the interplay with Byzantine defenses. The key findings are that FAT can achieve meaningful robustness in idealized FL but is highly sensitive to hyperparameters, and that common Byzantine defenses are vulnerable: Trimmed Mean and Bulyan can be compromised, and Krum can be subverted by a distillation-based attack that presents an apparently robust model to the defender even though the model fails against simple attack modifications. These results highlight open challenges for deploying FAT in realistic, non-IID, adversarially contaminated federated environments, and they motivate more systematic hyperparameter tuning and robust defense design.
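As a rough illustration of the min–max framing mentioned above, the objective can be sketched as follows. This is the standard adversarial training objective written for a FedAvg-style weighted sum over clients, not a formula taken from the paper. All symbols are assumptions introduced here: $K$ clients, client $k$ holding $n_k$ of the $n$ total samples drawn from its local distribution $\mathcal{D}_k$, a model $f_\theta$ with loss $\mathcal{L}$, and an $\ell_\infty$ perturbation budget $\epsilon$.

```latex
\min_{\theta} \; \sum_{k=1}^{K} \frac{n_k}{n} \,
\mathbb{E}_{(x, y) \sim \mathcal{D}_k}
\left[ \max_{\|\delta\|_\infty \le \epsilon}
\mathcal{L}\bigl(f_\theta(x + \delta),\, y\bigr) \right]
```

Under these assumptions, each client approximates the inner maximization on its own data (for example with a gradient-based attack), and the server handles the outer minimization by aggregating the client updates.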

Abstract

Federated learning (FL) is one of the most important paradigms addressing privacy and data governance issues in machine learning (ML). Adversarial training has emerged, so far, as the most promising approach against evasion threats on ML models. In this paper, we take the first known steps towards federated adversarial training (FAT), combining both methods to reduce the threat of evasion during inference while preserving data privacy during training. We investigate the effectiveness of the FAT protocol for idealised federated settings using MNIST, Fashion-MNIST, and CIFAR10, and provide first insights on stabilising the training on the LEAF benchmark dataset, which specifically emulates a federated learning environment. We identify challenges with this natural extension of adversarial training with regard to achieved adversarial robustness and further examine the idealised settings in the presence of clients undermining model convergence. We find that Trimmed Mean and Bulyan defences can be compromised, and we were able to subvert Krum with a novel distillation-based attack which presents an apparently "robust" model to the defender while in fact the model fails to provide robustness against simple attack modifications.
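For readers unfamiliar with the aggregation rules named in the abstract, the following is a minimal NumPy sketch of two of them, coordinate-wise Trimmed Mean and Krum, in their usual textbook form. It is not the paper's implementation and it does not reproduce the attacks the paper describes. The function names, the `trim` and `num_byzantine` parameters, and the representation of client updates as flat vectors are assumptions made for illustration.

```python
import numpy as np


def trimmed_mean(updates, trim):
    """Coordinate-wise trimmed mean over client updates.

    updates: array-like of shape (n_clients, n_params) (assumed layout).
    trim: number of smallest and largest values dropped per coordinate.
    """
    updates = np.asarray(updates, dtype=float)
    n = updates.shape[0]
    if trim < 0 or 2 * trim >= n:
        raise ValueError("trim must satisfy 0 <= 2 * trim < n_clients")
    # Sort each coordinate independently, then average the middle values.
    sorted_updates = np.sort(updates, axis=0)
    return sorted_updates[trim:n - trim].mean(axis=0)


def krum(updates, num_byzantine):
    """Krum: return the update closest to its nearest neighbours.

    Each update is scored by the sum of squared distances to its
    n - num_byzantine - 2 nearest other updates; the lowest score wins.
    """
    updates = np.asarray(updates, dtype=float)
    n = updates.shape[0]
    closest = n - num_byzantine - 2
    if closest < 1:
        raise ValueError("need n_clients >= num_byzantine + 3")
    # Pairwise squared Euclidean distances, shape (n, n).
    diffs = updates[:, None, :] - updates[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=-1)
    scores = []
    for i in range(n):
        others = np.delete(sq_dists[i], i)  # exclude distance to itself
        scores.append(np.sort(others)[:closest].sum())
    return updates[int(np.argmin(scores))]


# Toy example: four similar updates and one obvious outlier.
client_updates = [
    [0.9, 1.1],
    [1.0, 1.0],
    [1.1, 0.9],
    [1.0, 1.0],
    [100.0, -100.0],
]
robust_mean = trimmed_mean(client_updates, trim=1)
selected = krum(client_updates, num_byzantine=1)
```

In this toy example both rules ignore the outlier. The abstract's point is that carefully constructed malicious updates can still compromise such defences, which a naive outlier like this one does not demonstrate.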
