Mitigating Data Injection Attacks on Federated Learning

Or Shalom, Amir Leshem, Waheed U. Bajwa

TL;DR

The authors address data injection attacks in federated learning by proposing a lightweight, gradient-based detector that operates during training. The coordinator compares each edge agent's update to the coordinate-wise median and applies a $g(t)$-weighted model in which suspected attackers are temporarily ignored for a window $\Delta T$, with decisions aggregated over $K$ intervals to ensure robustness when a majority of agents are truthful. They prove that, under i.i.d. data and a majority of truthful agents, attackers are eliminated with probability $1$ after a finite time while trustworthy updates persist, and they demonstrate practical effectiveness via MNIST-based simulations against constant-output and label-flipping attacks. The approach offers provable resilience and is compatible with the convergence dynamics of federated learning, enabling safer deployment in adversarial environments.
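The comparison against the coordinate-wise median described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `flag_suspects` and `aggregate`, the Euclidean distance, the rule of averaging that distance over the window, and the numeric threshold are all hypothetical choices made for the example.

```python
import numpy as np

def flag_suspects(updates, delta_u):
    """Flag agents whose updates stay far from the coordinate-wise median.

    updates: array of shape (delta_T, n_agents, dim) holding each agent's
             update for delta_T consecutive rounds.
    delta_u: distance threshold (assumed decision rule).
    Returns a boolean array of shape (n_agents,); True marks a suspect.
    """
    # Coordinate-wise median across agents, computed separately per round.
    median = np.median(updates, axis=1, keepdims=True)   # (delta_T, 1, dim)
    # Euclidean distance of every agent from that median in every round.
    dist = np.linalg.norm(updates - median, axis=2)      # (delta_T, n_agents)
    # Assumed rule: compare the window-averaged distance to the threshold.
    return dist.mean(axis=0) > delta_u

def aggregate(latest_updates, suspects):
    """Average the latest updates of the agents that are not flagged."""
    return latest_updates[~suspects].mean(axis=0)

# Toy example: 5 agents, one of which adds a large constant offset.
rng = np.random.default_rng(0)
delta_T, n_agents, dim = 10, 5, 20
updates = rng.normal(0.0, 0.1, size=(delta_T, n_agents, dim))
updates[:, 4, :] += 5.0  # agent 4 plays the attacker
suspects = flag_suspects(updates, delta_u=1.0)
global_update = aggregate(updates[-1], suspects)
```

Because the median is taken per coordinate, a single attacker among five agents cannot move it far, so the honest agents remain close to it while the attacker does not. In the scheme summarized above, a flagged agent is only ignored temporarily and the decision is revisited in later windows.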

Abstract

Federated learning is a technique that allows multiple entities to collaboratively train models using their data without compromising data privacy. However, despite its advantages, federated learning can be susceptible to false data injection attacks. In these scenarios, a malicious entity with control over specific agents in the network can manipulate the learning process, leading to a suboptimal model. Consequently, addressing these data injection attacks presents a significant research challenge in federated learning systems. In this paper, we propose a novel technique to detect and mitigate data injection attacks on federated learning systems. Our mitigation method is a local scheme, performed during a single instance of training by the coordinating node, allowing mitigation while the algorithm converges. Whenever an agent is suspected of being an attacker, its data is ignored for a certain period; this decision is re-evaluated often. We prove that with probability 1, after a finite time, all attackers will be ignored while the probability of ignoring a truthful agent becomes 0, provided that there is a majority of truthful agents. Simulations show that when the coordinating node detects and isolates all the attackers, the model recovers and converges to the truthful model.

Paper Structure (9 sections, 2 theorems, 7 equations, 4 figures)

Key Result

Lemma 1

Assume that the majority of agents are trustworthy. Furthermore, assume that data is sub-Gaussian and i.i.d. between agents and classes. There exist values $\delta_u$ and $\Delta T$ for which $P_{FA}(I_k)<\frac{1}{2}<P_D(I_k)$ when detection is based on $\Delta T$ consecutive model updates, where …
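A small numerical illustration of why the condition $P_{FA}(I_k)<\frac{1}{2}<P_D(I_k)$ matters: if per-interval decisions are combined by a majority vote, any per-interval probability below one half is driven toward 0 and any probability above one half toward 1 as the number of intervals grows. The sketch below assumes independent intervals and a strict-majority rule, and the values 0.3 and 0.7 are purely illustrative; none of these specifics are taken from the paper.

```python
from math import comb

def majority_flag_prob(p, K):
    """Probability that a strict majority of K independent intervals flag
    an agent, when each interval flags it with probability p."""
    return sum(
        comb(K, j) * p**j * (1 - p) ** (K - j)
        for j in range(K // 2 + 1, K + 1)
    )

# Illustrative per-interval probabilities (assumed, not from the paper).
p_false_alarm, p_detect = 0.3, 0.7
for K in (1, 11, 51, 201):
    print(K, majority_flag_prob(p_false_alarm, K), majority_flag_prob(p_detect, K))
```

With an odd number of intervals there are no ties, so the two printed columns sum to one for the symmetric values chosen here; the first shrinks and the second grows as $K$ increases.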

Figures (4)

  • Figure 1: A simple federated learning system. Each agent performs training on its private dataset; the local updates are then transmitted to a coordinating node, which returns a global model. Some of the agents may be malicious, meaning they might send unreliable updates to the coordinating node.
  • Figure 2: Assessing Constant-Output Attack and Detection in Federated Learning: (a) Classification error under a single attacker across different truthful agent ratios, highlighting the attack's success without detection. (b) Effectiveness of the detection scheme, emphasizing network resilience and model accuracy against the attack.
  • Figure 3: Comparative Analysis of Constant-Output Attack: Showcasing $100$ experiments on an $N=5$ agent network, this figure highlights the model's classification error with and without the detection scheme [(a) and (b)]. Included are $10\%$ and $90\%$ confidence bounds, underscoring the attack's effect and the detection's efficacy.
  • Figure 4: Evaluating Label-Flipping Attack Impact: Plots (a) and (b) present the outcomes of 100 experiments without detection, showing classification error and label-flip success rate, respectively. (c) Classification error with detection implemented, offering a comparative perspective. All plots include 10% and 90% confidence bounds.
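The two attacks named in the captions can be mimicked in a toy experiment by corrupting a malicious agent's local labels before training. The sketch below is one possible way to do so and is an assumption rather than the paper's construction: the helper names `flip_labels` and `constant_output_labels`, and the idea of realizing the constant-output attack through relabeling, are hypothetical.

```python
import numpy as np

def flip_labels(labels, source, target):
    """Label-flipping: return a copy in which every `source` label
    is replaced by `target`; all other labels are left unchanged."""
    flipped = labels.copy()
    flipped[labels == source] = target
    return flipped

def constant_output_labels(labels, target):
    """Constant-output (assumed realization): relabel every sample as
    `target`, pushing the trained model toward a constant prediction."""
    return np.full_like(labels, target)

# Toy label vector standing in for a malicious agent's local dataset.
labels = np.array([0, 1, 7, 1, 3])
flipped = flip_labels(labels, source=1, target=7)
constant = constant_output_labels(labels, target=0)
```

A malicious agent would then train on `flipped` or `constant` instead of its true labels and send the resulting update to the coordinating node.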

Theorems & Definitions (2)

  • Lemma 1
  • Lemma 2