Federated Semi-Supervised and Semi-Asynchronous Learning for Anomaly Detection in IoT Networks
Wenbin Zhai, Liang Liu, Feng Wang, Youwei Ding, Wanying Lu, Weizhi Meng
TL;DR
The paper tackles anomaly detection in IoT networks under practical constraints of unlabeled client data, non-IID distributions, and limited communication. It introduces FedS3A, a framework that combines federated semi-supervised learning (pseudo-labeling at clients with server-side supervision), semi-asynchronous updates, and a staleness-tolerant distribution strategy, augmented by group-based aggregation and adaptive learning rates. Empirical results on CIC-IDS 2017 show FedS3A achieving over 98% accuracy with more than 50% reduction in communication cost, outperforming baseline FL approaches in both detection performance and round efficiency. The work presents a practical, robust approach for scalable IoT anomaly detection in heterogeneous, resource-constrained environments.
Abstract
Existing federated learning (FL)-based approaches rest on the unrealistic assumption that client-side data is fully annotated with ground truth. Furthermore, improving training efficiency while preserving detection accuracy is a major challenge in highly heterogeneous and resource-constrained IoT networks, and the communication cost between clients and the server cannot be ignored. In this paper, we therefore propose Federated Semi-Supervised and Semi-Asynchronous (FedS3A) learning for anomaly detection in IoT networks. First, we adopt the more realistic assumption that labeled data is available only at the server, and use pseudo-labeling to implement federated semi-supervised learning, in which a dynamic weight on supervised learning balances the supervised learning at the server against the unsupervised learning at clients. Then, we propose a semi-asynchronous model update and staleness-tolerant distribution scheme to achieve a trade-off between round efficiency and detection accuracy. The staleness of local models and the participation frequency of clients are taken into account to adjust their contributions to the global model. In addition, a group-based aggregation function is proposed to handle the non-IID distribution of the data. Finally, difference transmission based on sparse matrices is adopted to reduce the communication cost. Extensive experimental results show that FedS3A achieves greater than 98% accuracy even when the data is non-IID, and outperforms classic FL-based algorithms in both detection performance and round efficiency. FedS3A also reduces the communication cost by more than 50%.
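
The abstract does not spell out how the sparse difference transmission is encoded, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a sender transmits just the weight changes whose magnitude exceeds a threshold, as index/value pairs. The function names `encode_sparse_delta` and `apply_sparse_delta`, the `threshold` parameter, and the toy weight matrix are all hypothetical.

```python
import numpy as np

def encode_sparse_delta(old, new, threshold=1e-3):
    """Return (flat indices, values) for weight changes larger than threshold.

    The threshold rule is an illustrative assumption, not taken from the paper.
    """
    delta = (new - old).ravel()
    idx = np.flatnonzero(np.abs(delta) > threshold)
    return idx.astype(np.int32), delta[idx].astype(np.float32)

def apply_sparse_delta(old, idx, values):
    """Rebuild the updated weights from the previous weights and a sparse delta."""
    updated = old.copy().ravel()
    updated[idx] += values
    return updated.reshape(old.shape)

# Toy example: a 64x32 weight matrix in which only five entries change.
rng = np.random.default_rng(0)
old = rng.normal(size=(64, 32)).astype(np.float32)
new = old.copy()
new[0, :5] += 0.5

idx, vals = encode_sparse_delta(old, new)
restored = apply_sparse_delta(old, idx, vals)

dense_bytes = new.nbytes                  # cost of sending the full matrix
sparse_bytes = idx.nbytes + vals.nbytes   # cost of sending indices and values
print(f"dense: {dense_bytes} bytes, sparse: {sparse_bytes} bytes")
```

In this toy setting the saving is large because almost no weights change. In real training the saving depends on how many updates exceed the chosen threshold, and any changes that are dropped make the reconstruction approximate.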
