Zero-Knowledge Federated Learning: A New Trustworthy and Privacy-Preserving Distributed Learning Paradigm
Taotao Wang, Yuxin Jin, Qing Yang, Yihan Xia, Long Shi, Shengli Zhang
TL;DR
The paper addresses security and trust gaps in Federated Learning by introducing a structured ZK-FL framework and a Veri-CS-FL algorithm that uses Zero-Knowledge Proofs to verify client-side metrics. Each client's local model is scored by cosine similarity against a server-trained benchmark, and zk-SNARK proofs attest that the reported score is correct without exposing local data, enabling verifiable client selection and secure aggregation. The scheme is shown to satisfy completeness, soundness, and zero-knowledge, to resist poisoning, and to remain robust under data heterogeneity. Empirically, Veri-CS-FL saves bandwidth and keeps verification scalable on both IID and non-IID data, suggesting practical viability for privacy-preserving, trustworthy distributed learning at scale.
Abstract
Federated Learning (FL) has emerged as a promising paradigm in distributed machine learning, enabling collaborative model training while preserving data privacy. However, despite its many advantages, FL still contends with significant challenges -- most notably regarding security and trust. Zero-Knowledge Proofs (ZKPs) offer a potential solution by establishing trust and enhancing system integrity throughout the FL process. Although several studies have explored ZKP-based FL (ZK-FL), a systematic framework and comprehensive analysis are still lacking. This article makes two key contributions. First, we propose a structured ZK-FL framework that categorizes and analyzes the technical roles of ZKPs across various FL stages and tasks. Second, we introduce a novel algorithm, Verifiable Client Selection FL (Veri-CS-FL), which employs ZKPs to refine the client selection process. In Veri-CS-FL, participating clients generate verifiable proofs for the performance metrics of their local models and submit these concise proofs to the server for efficient verification. The server then selects clients with high-quality local models for uploading, subsequently aggregating the contributions from these selected clients. By integrating ZKPs, Veri-CS-FL not only ensures the accuracy of performance metrics but also fortifies trust among participants while enhancing the overall efficiency and security of FL systems.
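The selection flow described above (clients report a metric with a proof, the server verifies, selects, then aggregates) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the metric is the cosine similarity between a client's flattened update and a server-side benchmark update, that the server keeps the top-k verified clients, and that aggregation is a plain unweighted average. The function names, the toy vectors, and the boolean stand-in for a zk-SNARK proof are all hypothetical; a real system would verify a succinct cryptographic proof instead.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for this sketch: zero vectors score 0
    return dot / (norm_u * norm_v)

def verify_proof(claimed_metric, proof):
    """Hypothetical stand-in for zk-SNARK verification.

    ASSUMPTION: `proof` is just a boolean flag here. A real verifier would
    check a succinct proof that `claimed_metric` was computed correctly.
    """
    return proof is True

def select_clients(reports, k):
    """Keep clients whose proof verifies, then return the k highest metrics.

    reports: dict mapping client id -> (claimed_metric, proof)
    """
    verified = {cid: m for cid, (m, p) in reports.items() if verify_proof(m, p)}
    ranked = sorted(verified, key=verified.get, reverse=True)
    return ranked[:k]

def aggregate(updates):
    """Unweighted element-wise mean of the selected updates (an assumption)."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]

# Toy data: a benchmark update and four client updates (all made up).
benchmark = [1.0, 0.0]
updates = {
    "a": [2.0, 0.0],   # aligned with the benchmark
    "b": [1.0, 1.0],   # partially aligned
    "c": [-1.0, 0.0],  # opposite direction, e.g. a poisoned update
    "d": [3.0, 0.1],   # well aligned, but its proof will not verify
}

# Each client computes its metric locally and attaches a (stand-in) proof.
reports = {cid: (cosine_similarity(u, benchmark), cid != "d")
           for cid, u in updates.items()}

selected = select_clients(reports, k=2)
global_update = aggregate([updates[cid] for cid in selected])
```

Only the selected clients would upload their full updates, which is where the bandwidth saving in such a scheme comes from. The rejected client "d" never transmits its model at all.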
