FedGraM: Defending Against Untargeted Attacks in Federated Learning via Embedding Gram Matrix
Di Wu, Qian Li, Heng Yang, Yong Han
TL;DR
FedGraM defends federated learning against untargeted attacks by feeding a small server-held auxiliary dataset through each client's local model and computing the Gram matrix of the resulting embeddings, which it uses to infer that model's generalization capability. The server discards the C local models whose Gram matrices have the largest norms and averages the remaining models, giving a robust aggregation rule that remains effective under non-IID data distributions even when the server holds very little data. Empirical evaluations on CIFAR-10, SVHN, and CIFAR-100 show FedGraM outperforming state-of-the-art defenses in most scenarios, with ablations supporting the choice of C and of the auxiliary data. The paper also analyzes FedGraM's limitations under adaptive attacks and the potential gains from combining it with other robust defenses.
Abstract
Federated Learning (FL) enables geographically distributed clients to collaboratively train machine learning models by sharing only their local models, which keeps their data private. However, FL is vulnerable to untargeted attacks that aim to degrade the global model's performance on the underlying data distribution. Existing defense mechanisms attempt to improve FL's resilience against such attacks, but their effectiveness is limited in practical FL environments because of data heterogeneity. In contrast, we aim to detect and remove malicious models to mitigate their impact. A local model's contribution to generalization plays a crucial role in distinguishing untargeted attacks. Our observations indicate that, with limited data, the divergence between embeddings representing different classes provides a better measure of generalization than direct accuracy. In light of this, we propose a novel robust aggregation method, FedGraM, designed to defend against untargeted attacks in FL. The server maintains an auxiliary dataset containing one sample per class to support aggregation. This dataset is fed to the local models to extract embeddings. The server then calculates the norm of the Gram matrix of the embeddings for each local model. The norm serves as an indicator of each model's inter-class separation capability in the embedding space, with larger norms treated as a sign of weaker separation. FedGraM identifies and removes potentially malicious models by filtering out those with the largest norms, then averages the remaining local models to form the global model. We conduct extensive experiments to evaluate the performance of FedGraM. Our empirical results show that, even with only a few data samples used to construct the auxiliary dataset, FedGraM outperforms state-of-the-art defense methods.
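The aggregation steps described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `gram_norm` and `fedgram_aggregate`, the row-wise L2 normalization of embeddings, the use of the Frobenius norm, and the representation of each local model as a flat parameter vector are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def gram_norm(embeddings):
    """Norm of the Gram matrix of one local model's embeddings.

    embeddings: array of shape (num_classes, dim), one row per auxiliary
    sample (the abstract uses one sample per class).
    Assumptions (not stated in the abstract): rows are L2-normalized and
    the Frobenius norm is used.
    """
    row_norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / (row_norms + 1e-12)  # avoid division by zero
    gram = unit @ unit.T  # pairwise similarities between class embeddings
    return float(np.linalg.norm(gram, ord="fro"))


def fedgram_aggregate(client_params, client_embeddings, num_removed):
    """Drop the num_removed models with the largest Gram norms, average the rest.

    client_params: list of flat parameter vectors (hypothetical representation).
    client_embeddings: list of (num_classes, dim) arrays, one per client.
    Returns (global_params, kept_indices, scores).
    """
    n = len(client_params)
    if len(client_embeddings) != n:
        raise ValueError("need one embedding matrix per client")
    if not 0 <= num_removed < n:
        raise ValueError("num_removed must satisfy 0 <= num_removed < n")

    scores = np.array([gram_norm(e) for e in client_embeddings])
    order = np.argsort(scores)  # ascending: best-separated models first
    kept = sorted(order[: n - num_removed].tolist())
    global_params = np.mean([client_params[i] for i in kept], axis=0)
    return global_params, kept, scores
```

In this sketch, a model whose class embeddings are mutually orthogonal yields an identity Gram matrix, while a model that maps every class to the same direction yields an all-ones Gram matrix with a larger norm, so the collapsed model is the one filtered out. The number of removed models corresponds to the parameter C mentioned in the TL;DR.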
