Adaptive Federated Learning Defences via Trust-Aware Deep Q-Networks
Vedant Palit
TL;DR
Federated learning faces poisoning and backdoor risks under partial observability. The authors propose a trust-aware Deep Q-Network (DQN) defense, framed as a partially observable Markov decision process, that combines multi-signal anomaly evidence with Bayesian belief tracking to adaptively weight client updates. On CIFAR-10, the approach outperforms static defenses and other RL baselines: accuracy improves while attacks are kept in check, aided by sequential belief updates that stabilize trust decisions. The work presents a practical, reproducible method for robust FL defense that uses temporal evidence to counter adaptive adversaries despite partial observability.
Abstract
Federated learning is vulnerable to poisoning and backdoor attacks under partial observability. We formulate defence as a partially observable sequential decision problem and introduce a trust-aware Deep Q-Network (DQN) that integrates multi-signal evidence into client trust updates while optimizing a long-horizon robustness–accuracy objective. On CIFAR-10, we (i) establish a baseline showing steadily improving accuracy; (ii) show through a Dirichlet sweep that increased client overlap consistently improves accuracy and reduces attack success rate (ASR) while detection remains stable; and (iii) demonstrate in a signal-budget study that, as observability is reduced, accuracy remains steady while ASR increases and ROC-AUC declines, which highlights that sequential belief updates mitigate weaker signals. Finally, a comparison with random, linear-Q, and policy gradient controllers confirms that DQN achieves the best robustness–accuracy trade-off.
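As a rough illustration of what trust-aware weighting of client updates with sequential belief tracking might look like, the sketch below keeps a Beta belief per client, treats a per-round anomaly score in [0, 1] as soft evidence, and aggregates updates in proportion to the posterior mean trust. The Beta-Bernoulli belief model, the function names, and the single scalar anomaly score are assumptions made for illustration; the abstract does not specify them, and the DQN controller itself is not shown.

```python
import numpy as np

def update_beliefs(alpha, beta, anomaly_scores):
    """Soft Beta update (assumed model): a low anomaly score adds benign
    evidence to alpha, a high score adds malicious evidence to beta."""
    s = np.clip(np.asarray(anomaly_scores, dtype=float), 0.0, 1.0)
    return alpha + (1.0 - s), beta + s

def trust_weighted_aggregate(updates, alpha, beta):
    """Average client updates, weighted by posterior mean trust."""
    trust = alpha / (alpha + beta)      # posterior mean of each Beta belief
    weights = trust / trust.sum()       # normalise so the weights sum to 1
    return weights @ updates, weights

# Hypothetical round loop: three clients, the third looks anomalous each round.
alpha = np.ones(3)
beta = np.ones(3)
for _ in range(5):
    alpha, beta = update_beliefs(alpha, beta, [0.1, 0.1, 0.9])

updates = np.array([[1.0, 1.0], [1.0, 1.0], [-5.0, -5.0]])
aggregate, weights = trust_weighted_aggregate(updates, alpha, beta)
```

Because evidence accumulates across rounds, a single noisy anomaly score moves a client's trust only slightly, which is one way to read the claim that sequential belief updates compensate for weaker signals.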
