FLARE: Adaptive Multi-Dimensional Reputation for Robust Client Reliability in Federated Learning
Abolfazl Younesi, Leon Kiss, Zahra Najafabadi Samani, Juan Aznar Poveda, Thomas Fahringer
TL;DR
FLARE addresses the fragility of static, binary trust mechanisms in federated learning by introducing a dynamic, multi-dimensional reputation framework that continuously evaluates client reliability across performance, statistical, and temporal dimensions. It employs an adaptive threshold that matches security rigor to the model's convergence state and recent attack intensity, and uses reputation-weighted aggregation with soft exclusion to balance robustness and participation, all while preserving privacy via Local Differential Privacy. A Statistical Mimicry (SM) attack benchmark tests the framework's resilience, and extensive experiments on MNIST, CIFAR-10, and SVHN with 100 clients demonstrate that FLARE maintains higher accuracy and faster convergence than state-of-the-art defenses under a range of attacks, including adaptive and evasive strategies. The results indicate that FLARE achieves strong malicious-client detection with low overhead and remains effective across varying data heterogeneity and attack intensities, making it practical for real-world deployments.
Abstract
Federated learning (FL) enables collaborative model training while preserving data privacy. However, it remains vulnerable to malicious clients who compromise model integrity through Byzantine attacks, data poisoning, or adaptive adversarial behaviors. Existing defense mechanisms rely on static thresholds and binary classification, failing to adapt to evolving client behaviors in real-world deployments. We propose FLARE, an adaptive reputation-based framework that transforms client reliability assessment from binary decisions to a continuous, multi-dimensional trust evaluation. FLARE integrates: (i) a multi-dimensional reputation score capturing performance consistency, statistical anomaly indicators, and temporal behavior, (ii) a self-calibrating adaptive threshold mechanism that adjusts security strictness based on model convergence and recent attack intensity, (iii) reputation-weighted aggregation with soft exclusion to proportionally limit suspicious contributions rather than eliminating clients outright, and (iv) a Local Differential Privacy (LDP) mechanism enabling reputation scoring on privatized client updates. We further introduce a highly evasive Statistical Mimicry (SM) attack, a benchmark adversary that blends honest gradients with synthetic perturbations and persistent drift to remain undetected by traditional filters. Extensive experiments with 100 clients on MNIST, CIFAR-10, and SVHN demonstrate that FLARE maintains high model accuracy and converges faster than state-of-the-art Byzantine-robust methods under diverse attack types, including label flipping, gradient scaling, adaptive attacks, ALIE, and SM. FLARE improves robustness by up to 16% and preserves model convergence within 30% of the non-attacked baseline, while achieving strong malicious-client detection performance with minimal computational overhead.
https://github.com/Anonymous0-0paper/FLARE
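To make the idea of reputation-weighted aggregation with soft exclusion concrete, the following is a minimal illustrative sketch and not the FLARE implementation. It assumes each client already has a scalar reputation in [0, 1] and that a threshold has been chosen elsewhere. The function names `soft_weights` and `aggregate`, the `sharpness` parameter, and the power-law attenuation rule are all hypothetical choices made for illustration; the abstract states only that suspicious contributions are limited proportionally rather than removed.

```python
from typing import List, Sequence


def soft_weights(reputations: Sequence[float],
                 threshold: float,
                 sharpness: float = 2.0) -> List[float]:
    """Map reputations in [0, 1] to normalized aggregation weights.

    Assumed rule (hypothetical, for illustration only): clients at or
    above `threshold` keep their reputation as raw weight; clients below
    it are attenuated by (r / threshold) ** sharpness instead of being
    dropped outright.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    raw = []
    for r in reputations:
        r = min(max(r, 0.0), 1.0)  # clamp to the valid range
        if r >= threshold:
            raw.append(r)
        else:
            raw.append(r * (r / threshold) ** sharpness)
    total = sum(raw)
    if total == 0.0:
        raise ValueError("all clients have zero weight")
    return [w / total for w in raw]


def aggregate(updates: Sequence[Sequence[float]],
              reputations: Sequence[float],
              threshold: float,
              sharpness: float = 2.0) -> List[float]:
    """Reputation-weighted average of equally sized client update vectors."""
    if len(updates) != len(reputations):
        raise ValueError("one reputation is required per update")
    weights = soft_weights(reputations, threshold, sharpness)
    dim = len(updates[0])
    return [sum(w * u[i] for w, u in zip(weights, updates))
            for i in range(dim)]


# Two trusted clients agree; one low-reputation client sends an outlier.
updates = [[1.0, 1.0], [1.0, 1.0], [10.0, 10.0]]
reputations = [0.9, 0.9, 0.1]
weights = soft_weights(reputations, threshold=0.5)
global_update = aggregate(updates, reputations, threshold=0.5)
```

In this toy setting the low-reputation client keeps a small nonzero weight, so it still participates, while the aggregate stays close to the trusted clients' update instead of the plain mean of 4.0 per coordinate. In an adaptive scheme such as the one the abstract describes, `threshold` would change from round to round rather than stay fixed.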
