GuardFed: A Trustworthy Federated Learning Framework Against Dual-Facet Attacks

Yanli Li, Yanan Zhou, Zhongliang Guo, Nan Yang, Yuning Zhang, Huaming Chen, Dong Yuan, Weiping Ding, Witold Pedrycz

TL;DR

This paper addresses the vulnerability of federated learning to dual-objective attacks that degrade both accuracy and group fairness. It introduces the Dual-Facet Attack (DFA), with two variants (S-DFA and Sp-DFA), and a defense framework, GuardFed. GuardFed builds a fairness-aware reference model from a small amount of clean server data augmented with synthetic samples generated via a Gaussian copula, and uses a dual-perspective trust score to selectively aggregate client updates. Experiments on the COMPAS and ADULT datasets under varying non-IID conditions show that DFA significantly degrades existing robust and fairness-aware FL methods, while GuardFed achieves state-of-the-art accuracy and fairness, even under strong adversarial pressure. The results indicate that GuardFed is practical for trustworthy and fair federated learning without requiring large-scale server data or stringent trust assumptions.

Abstract

Federated learning (FL) enables privacy-preserving collaborative model training but remains vulnerable to adversarial behaviors that compromise model utility or fairness across sensitive groups. While extensive studies have examined attacks targeting either objective, strategies that simultaneously degrade both utility and fairness remain largely unexplored. To bridge this gap, we introduce the Dual-Facet Attack (DFA), a novel threat model that concurrently undermines predictive accuracy and group fairness. Two variants, Synchronous DFA (S-DFA) and Split DFA (Sp-DFA), are further proposed to capture distinct real-world collusion scenarios. Experimental results show that existing robust FL defenses, including hybrid aggregation schemes, fail to resist DFAs effectively. To counter these threats, we propose GuardFed, a self-adaptive defense framework that maintains a fairness-aware reference model using a small amount of clean server data augmented with synthetic samples. In each training round, GuardFed computes a dual-perspective trust score for every client by jointly evaluating its utility deviation and fairness degradation, thereby enabling selective aggregation of trustworthy updates. Extensive experiments on real-world datasets demonstrate that GuardFed consistently preserves both accuracy and fairness under diverse non-IID and adversarial conditions, achieving state-of-the-art performance compared with existing robust FL methods.
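The dual-perspective idea in the abstract can be illustrated with a small sketch. The abstract does not give the scoring rule, so everything concrete here is an assumption made for illustration: the multiplicative form of `trust_score`, the `min_trust` cutoff, the trust-weighted average, and all function and parameter names are hypothetical and are not taken from the paper. The sketch assumes each client update has already been evaluated on server-side data, yielding an accuracy and a fairness gap (for example AEOD) that can be compared with those of the reference model.

```python
from typing import List, Sequence


def trust_score(client_acc: float, client_gap: float,
                ref_acc: float, ref_gap: float) -> float:
    """Hypothetical trust score in [0, 1].

    Penalizes a client only when it is worse than the reference model,
    either in accuracy (utility deviation) or in fairness gap
    (fairness degradation). The product form is an assumption.
    """
    utility_drop = max(0.0, ref_acc - client_acc)
    fairness_drop = max(0.0, client_gap - ref_gap)
    return max(0.0, 1.0 - utility_drop) * max(0.0, 1.0 - fairness_drop)


def aggregate(updates: Sequence[Sequence[float]],
              scores: Sequence[float],
              min_trust: float = 0.8) -> List[float]:
    """Trust-weighted average over clients whose score reaches min_trust."""
    kept = [(u, s) for u, s in zip(updates, scores) if s >= min_trust]
    if not kept:
        raise ValueError("no client passed the trust threshold")
    total = sum(s for _, s in kept)
    dim = len(kept[0][0])
    return [sum(u[i] * s for u, s in kept) / total for i in range(dim)]


# Reference model: 70% accuracy, fairness gap 0.05 (illustrative numbers).
ref_acc, ref_gap = 0.70, 0.05
score_a = trust_score(0.69, 0.06, ref_acc, ref_gap)  # close to the reference
score_b = trust_score(0.40, 0.30, ref_acc, ref_gap)  # worse on both facets
global_update = aggregate([[1.0, 2.0], [9.0, 9.0]], [score_a, score_b])
```

With these illustrative numbers, the second client scores well below the cutoff on both facets and is excluded, so the aggregate equals the first client's update.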

Paper Structure

This paper contains 18 sections, 18 equations, 4 figures, 3 tables, 1 algorithm.

Figures (4)

  • Figure 1: Overview of the GuardFed system. All participants, including potential attackers, perform local model training and upload their updates. The server applies a self-adaptive aggregation mechanism to ensure both fairness and robustness in the global model.
  • Figure 2: Learning performance of fairness-enhanced, robust, and hybrid FL algorithms under benign and Dual-Facet Attack (DFA) scenarios in IID and non-IID settings on the COMPAS dataset.
  • Figure 3: Fairness index (AEOD and ASPD) of fairness-enhanced, robust, and hybrid FL algorithms under benign and Dual-Facet Attack (DFA) scenarios on the COMPAS dataset, evaluated in both IID and non-IID settings. Underlines indicate models whose test accuracy falls below the predefined threshold, rendering the fairness metrics unreliable.
  • Figure 4: Model accuracy and AEOD under different hyperparameters and adversarial settings.

Theorems & Definitions (4)

  • Definition 1: Absolute Statistical Parity Difference (ASPD) (Zemel et al., 2013)
  • Definition 2: Absolute Equal Opportunity Difference (AEOD) (Zemel et al., 2013)
  • Definition 3: Synchronous Dual-Facet Attack (S-DFA)
  • Definition 4: Split Dual-Facet Attack (Sp-DFA)
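The two fairness metrics named in Definitions 1 and 2 can be sketched as follows. This listing does not reproduce the paper's formulas, so the sketch assumes the commonly used forms for binary predictions and a binary sensitive attribute: ASPD as the absolute difference in positive-prediction rates between the two groups, and AEOD as the absolute difference in true-positive rates. The function names and the 0/1 encoding are illustrative assumptions.

```python
from typing import List, Sequence


def _positive_rate(preds: List[int]) -> float:
    """Fraction of predictions equal to 1; the group must be non-empty."""
    if not preds:
        raise ValueError("group has no samples")
    return sum(preds) / len(preds)


def aspd(y_pred: Sequence[int], sensitive: Sequence[int]) -> float:
    """|P(y_hat=1 | s=0) - P(y_hat=1 | s=1)| (assumed standard form)."""
    g0 = [p for p, s in zip(y_pred, sensitive) if s == 0]
    g1 = [p for p, s in zip(y_pred, sensitive) if s == 1]
    return abs(_positive_rate(g0) - _positive_rate(g1))


def aeod(y_true: Sequence[int], y_pred: Sequence[int],
         sensitive: Sequence[int]) -> float:
    """|TPR(s=0) - TPR(s=1)|, computed over samples with y_true == 1."""
    g0 = [p for t, p, s in zip(y_true, y_pred, sensitive) if t == 1 and s == 0]
    g1 = [p for t, p, s in zip(y_true, y_pred, sensitive) if t == 1 and s == 1]
    return abs(_positive_rate(g0) - _positive_rate(g1))


# Toy data: both groups receive positive predictions at the same rate,
# but true positives are recovered less often in group 0.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
aspd_value = aspd(y_pred, group)          # 0.0
aeod_value = aeod(y_true, y_pred, group)  # 0.5
```

The toy data shows why both metrics are reported: a model can have equal positive-prediction rates across groups (ASPD of 0) while still having unequal true-positive rates (AEOD of 0.5).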