A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection
Qinbin Li, Zeyi Wen, Zhaomin Wu, Sixu Hu, Naibo Wang, Yuan Li, Xu Liu, Bingsheng He
TL;DR
This paper addresses the challenge of making federated learning practically usable at scale by adopting a system-centric view. It introduces a six-aspect taxonomy (data partitioning, ML models, privacy mechanisms, communication architecture, federation scale, and motivation) to categorize existing federated learning systems (FLSs) and guide design choices. Through a survey of methodologies, case studies, open-source platforms, and design principles, it highlights current limitations in effectiveness, efficiency, and privacy, and outlines future directions including benchmarks, decentralization, and incentive mechanisms. The work aims to equip developers and researchers with concrete system abstractions and evaluation guidance for building robust, privacy-preserving FLSs that can operate across domains and scales.
Abstract
Federated learning has been a hot research topic for enabling the collaborative training of machine learning models among different organizations under privacy restrictions. As researchers try to support more machine learning models with different privacy-preserving approaches, systems and infrastructures are needed to ease the development of various federated learning algorithms. Just as deep learning systems such as PyTorch and TensorFlow have boosted the development of deep learning, federated learning systems (FLSs) are equally important, and they face challenges in aspects such as effectiveness, efficiency, and privacy. In this survey, we conduct a comprehensive review of federated learning systems. To provide a smooth flow and guide future research, we introduce the definition of federated learning systems and analyze the system components. Moreover, we provide a thorough categorization of federated learning systems according to six aspects: data distribution, machine learning model, privacy mechanism, communication architecture, scale of federation, and motivation of federation. As our case studies show, this categorization can help guide the design of federated learning systems. By systematically summarizing the existing federated learning systems, we present the design factors, case studies, and future research opportunities.
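As an illustration of how the six-aspect categorization might be used in practice, the following minimal sketch records one system's position along each aspect and compares two systems. The class name `FLSProfile`, the helper `differing_aspects`, and all example values are hypothetical and chosen only for illustration; they are not definitions taken from the survey.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FLSProfile:
    """Hypothetical record describing one FLS along the six aspects."""
    data_distribution: str
    ml_model: str
    privacy_mechanism: str
    communication_architecture: str
    scale_of_federation: str
    motivation_of_federation: str


def differing_aspects(a: FLSProfile, b: FLSProfile) -> list[str]:
    """Return the names of the aspects on which two profiles differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


# Illustrative values only (assumptions, not drawn from the text above).
system_a = FLSProfile(
    data_distribution="horizontal",
    ml_model="neural network",
    privacy_mechanism="differential privacy",
    communication_architecture="centralized",
    scale_of_federation="cross-device",
    motivation_of_federation="incentive",
)
system_b = FLSProfile(
    data_distribution="vertical",
    ml_model="decision tree",
    privacy_mechanism="cryptographic",
    communication_architecture="centralized",
    scale_of_federation="cross-silo",
    motivation_of_federation="regulation",
)

print(differing_aspects(system_a, system_b))
```

Treating each aspect as an independent field makes it easy to see which design choices two systems share and where they diverge, which is the kind of comparison the categorization is meant to support.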
