Mitigating Data Injection Attacks on Federated Learning
Or Shalom, Amir Leshem, Waheed U. Bajwa
TL;DR
The authors address data injection attacks in federated learning by proposing a lightweight, gradient-based detector that operates during training. The coordinator compares each edge agent's update to the coordinatewise median of the updates and applies a $g(t)$-weighted model in which suspected attackers are temporarily ignored for a window $\Delta T$; decisions are aggregated over $K$ intervals so that detection stays robust as long as a majority of agents are truthful. They prove that, under i.i.d. data and a truthful majority, all attackers are ignored with probability $1$ after a finite time while the updates of truthful agents continue to be used, and they demonstrate practical effectiveness via MNIST-based simulations against constant-output and label-flipping attacks. The approach offers provable resilience and is compatible with the convergence dynamics of federated learning, enabling safer deployment in adversarial environments.
Abstract
Federated learning is a technique that allows multiple entities to collaboratively train models using their data without compromising data privacy. However, despite its advantages, federated learning can be susceptible to false data injection attacks. In these scenarios, a malicious entity with control over specific agents in the network can manipulate the learning process, leading to a suboptimal model. Consequently, addressing these data injection attacks presents a significant research challenge in federated learning systems. In this paper, we propose a novel technique to detect and mitigate data injection attacks on federated learning systems. Our mitigation method is a local scheme, performed during a single instance of training by the coordinating node, allowing mitigation during the convergence of the algorithm. Whenever an agent is suspected to be an attacker, its data will be ignored for a certain period; this decision will often be re-evaluated. We prove that with probability 1, after a finite time, all attackers will be ignored while the probability of ignoring a truthful agent becomes 0, provided that there is a majority of truthful agents. Simulations show that when the coordinating node detects and isolates all the attackers, the model recovers and converges to the truthful model.
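The detect-and-ignore loop summarized above can be illustrated with a minimal Python sketch. It is not the authors' detector: the Euclidean distance test, the fixed `threshold`, the plain averaging of the remaining updates, and the names `Coordinator`, `flag_suspects`, and `window` (playing the role of the ignore period) are all assumptions made for illustration, and the aggregation of decisions over several intervals is omitted.

```python
from statistics import median


def coordinatewise_median(updates):
    """Median of each coordinate across the agents' update vectors."""
    return [median(coords) for coords in zip(*updates)]


def distance(u, v):
    """Euclidean distance between two vectors (an assumed choice of metric)."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def flag_suspects(updates, threshold):
    """Indices of agents whose update lies farther than `threshold` from the median."""
    med = coordinatewise_median(updates)
    return {i for i, u in enumerate(updates) if distance(u, med) > threshold}


def aggregate(updates, ignored):
    """Average the updates of all agents that are not currently ignored."""
    kept = [u for i, u in enumerate(updates) if i not in ignored]
    if not kept:
        raise ValueError("every agent is ignored; nothing to aggregate")
    return [sum(coords) / len(kept) for coords in zip(*kept)]


class Coordinator:
    """Hypothetical coordinating node that ignores suspects for `window` rounds."""

    def __init__(self, threshold, window):
        self.threshold = threshold  # assumed fixed detection threshold
        self.window = window        # number of rounds a suspect stays ignored
        self.ignored_until = {}     # agent index -> first round it is used again

    def step(self, round_idx, updates):
        # Re-evaluate every round: a fresh suspicion restarts the ignore window.
        for i in flag_suspects(updates, self.threshold):
            self.ignored_until[i] = round_idx + self.window
        ignored = {i for i, t in self.ignored_until.items() if round_idx < t}
        return aggregate(updates, ignored), ignored
```

In this sketch an agent that keeps sending outlying updates is flagged again each round and so stays excluded, while an agent flagged by mistake is readmitted once its window expires, which mirrors the temporary, re-evaluated nature of the decision described in the abstract.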
