Towards Open Federated Learning Platforms: Survey and Vision from Technical and Legal Perspectives
Moming Duan, Qinbin Li, Linshan Jiang, Bingsheng He
TL;DR
This paper argues that server-dominated Federated Learning (FL) constrains broad participation and model reuse, and it proposes two open-platform paradigms, query-based FL and contract-based FL, to realize a crowdsourced collaborative ML ecosystem. It develops a model-licensing taxonomy and analyzes legal, technical, and intellectual property (IP) considerations for batch model reuse, providing guidelines for license selection and IP protection. The work also surveys how existing model repositories and knowledge distillation (KD) and generation techniques can support open FL, and it outlines ML microtask designs and platform architectures for both reciprocal frameworks. By outlining remaining challenges and possible pathways, the paper envisions practical open FL platforms that enable AI access and collaboration for a wide range of Internet users.
Abstract
Traditional Federated Learning (FL) follows a server-dominated cooperation paradigm, which narrows the application scenarios of FL and dampens data holders' enthusiasm to participate. To fully unleash the potential of FL, we advocate rethinking the design of current FL frameworks and extending them to a more generalized concept: Open Federated Learning Platforms, positioned as a crowdsourced collaborative machine learning infrastructure for all Internet users. We propose two reciprocal cooperation frameworks to achieve this: query-based FL and contract-based FL. In this survey, we conduct a comprehensive review of the feasibility of constructing open FL platforms from both technical and legal perspectives. We begin by reviewing the definition of FL and summarizing its inherent limitations, including server-client coupling, low model reusability, and its non-public nature. In particular, we introduce a novel taxonomy to streamline the analysis of model license compatibility in FL studies that involve batch model reuse methods, including combination, amalgamation, distillation, and generation. This taxonomy provides a feasible way to identify the relevant license clauses and facilitates the analysis of potential legal implications and restrictions when reusing models. Through this survey, we uncover the current dilemmas faced by FL and advocate for the development of sustainable open FL platforms. We aim to provide guidance for establishing such platforms in the future while identifying potential limitations that need to be addressed.
