Robust Federated Learning Mitigates Client-side Training Data Distribution Inference Attacks

Yichang Xu, Ming Yin, Minghong Fang, Neil Zhenqiang Gong

TL;DR

This work addresses the vulnerability of federated learning to client-side data distribution inference attacks by introducing InferGuard, a Byzantine-robust aggregation rule. InferGuard computes the coordinate-wise median of client updates, filters out updates that deviate beyond a threshold $\lambda$ from the median, and aggregates only the benign updates, ensuring robustness against malicious clients while preserving model performance. The authors validate InferGuard across five datasets against ten baselines, including adaptive and membership-inference threats, demonstrating superior defense effectiveness and resilience under non-i.i.d. data. The results suggest practical applicability of InferGuard in real FL deployments, with future work aimed at providing formal robustness guarantees.

Abstract

Recent studies have revealed that federated learning (FL), once considered secure due to clients not sharing their private data with the server, is vulnerable to attacks such as client-side training data distribution inference, where a malicious client can recreate the victim's data. While various countermeasures exist, they are not practical, often assuming server access to some training data or knowledge of label distribution before the attack. In this work, we bridge the gap by proposing InferGuard, a novel Byzantine-robust aggregation rule aimed at defending against client-side training data distribution inference attacks. In our proposed InferGuard, the server first calculates the coordinate-wise median of all the model updates it receives. A client's model update is considered malicious if it significantly deviates from the computed median update. We conduct a thorough evaluation of our proposed InferGuard on five benchmark datasets and perform a comparison with ten baseline methods. The results of our experiments indicate that our defense mechanism is highly effective in protecting against client-side training data distribution inference attacks, even against strong adaptive attacks. Furthermore, our method substantially outperforms the baseline methods in various practical FL scenarios.
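
The aggregation steps described above (coordinate-wise median, deviation check, aggregation of the remaining updates) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `filter_and_aggregate`, the use of Euclidean distance as the deviation measure, the plain averaging of retained updates, and the fallback to the median when every update is filtered are all assumptions made for illustration; the summary states only that updates deviating from the median beyond a threshold are treated as malicious.

```python
import numpy as np


def filter_and_aggregate(updates, lam):
    """Illustrative median-based filtering; not the paper's exact rule.

    updates: array-like of shape (n_clients, n_params), one row per client.
    lam:     deviation threshold (plays the role of lambda in the summary).
    Returns the aggregated update and a boolean mask of retained clients.
    """
    updates = np.asarray(updates, dtype=float)

    # Step 1: coordinate-wise median over all received updates.
    median = np.median(updates, axis=0)

    # Step 2 (assumption): measure each client's deviation as the
    # Euclidean distance between its update and the median update.
    distances = np.linalg.norm(updates - median, axis=1)
    keep = distances <= lam

    # Fallback (assumption): if every update is filtered out,
    # return the median itself rather than an empty average.
    if not keep.any():
        return median, keep

    # Step 3 (assumption): average the updates that passed the check.
    return updates[keep].mean(axis=0), keep


# Three similar updates and one outlier (hypothetical values).
client_updates = [
    [0.10, 0.20],
    [0.12, 0.18],
    [0.09, 0.21],
    [5.00, -4.00],
]
aggregated, kept = filter_and_aggregate(client_updates, lam=0.5)
print(kept)        # mask of clients whose updates were retained
print(aggregated)  # average of the retained updates
```

In this toy run the fourth update lies far from the coordinate-wise median and is excluded, so the aggregate is computed from the first three clients only. A smaller threshold filters more aggressively, at the risk of discarding benign updates when client data is non-i.i.d.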

Paper Structure

This paper contains 11 sections, 2 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Whether a malicious model update is chosen in each round on the MNIST dataset.
  • Figure 2: Reconstruction samples on the MNIST dataset.
  • Figure 3: Impact of different $\lambda$ on the MNIST dataset.