Zero-Knowledge Federated Learning: A New Trustworthy and Privacy-Preserving Distributed Learning Paradigm

Taotao Wang, Yuxin Jin, Qing Yang, Yihan Xia, Long Shi, Shengli Zhang

TL;DR

The paper addresses security and trust gaps in Federated Learning (FL) with a structured ZK-FL framework and the Veri-CS-FL algorithm, which uses Zero-Knowledge Proofs to verify client-side performance metrics. Each client's local model is evaluated by cosine similarity against a server-trained benchmark, and a zk-SNARK proof attests that the metric was computed correctly without exposing local data, enabling verifiable client selection and secure aggregation. The scheme is shown to satisfy completeness, soundness, and zero-knowledge, and to resist poisoning and remain robust to data heterogeneity. Experiments on IID and non-IID data report bandwidth savings and scalable verification, suggesting the approach is practical for privacy-preserving, trustworthy distributed learning at scale.

Abstract

Federated Learning (FL) has emerged as a promising paradigm in distributed machine learning, enabling collaborative model training while preserving data privacy. However, despite its many advantages, FL still contends with significant challenges -- most notably regarding security and trust. Zero-Knowledge Proofs (ZKPs) offer a potential solution by establishing trust and enhancing system integrity throughout the FL process. Although several studies have explored ZKP-based FL (ZK-FL), a systematic framework and comprehensive analysis are still lacking. This article makes two key contributions. First, we propose a structured ZK-FL framework that categorizes and analyzes the technical roles of ZKPs across various FL stages and tasks. Second, we introduce a novel algorithm, Verifiable Client Selection FL (Veri-CS-FL), which employs ZKPs to refine the client selection process. In Veri-CS-FL, participating clients generate verifiable proofs for the performance metrics of their local models and submit these concise proofs to the server for efficient verification. The server then selects clients with high-quality local models for uploading, subsequently aggregating the contributions from these selected clients. By integrating ZKPs, Veri-CS-FL not only ensures the accuracy of performance metrics but also fortifies trust among participants while enhancing the overall efficiency and security of FL systems.
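
The selection flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the performance metric is the cosine similarity between a client's flattened parameter vector and that of a server-trained benchmark, it replaces zk-SNARK verification with a boolean `verified` flag, and it uses a plain unweighted average for aggregation. The function names `cosine_similarity`, `select_clients`, and `aggregate` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length parameter vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for this sketch: a zero vector scores 0
    return dot / (norm_u * norm_v)


def select_clients(scores: Dict[str, float],
                   verified: Dict[str, bool],
                   k: int) -> List[str]:
    """Keep clients whose proof verified, then take the k highest scores.

    `verified` stands in for the outcome of zk-SNARK verification,
    which is not implemented here (assumption of this sketch).
    """
    eligible = [c for c in scores if verified.get(c, False)]
    return sorted(eligible, key=lambda c: scores[c], reverse=True)[:k]


def aggregate(models: Dict[str, Sequence[float]],
              selected: List[str]) -> List[float]:
    """Unweighted average of the selected clients' parameter vectors."""
    if not selected:
        raise ValueError("no clients selected")
    dim = len(models[selected[0]])
    return [sum(models[c][i] for c in selected) / len(selected)
            for i in range(dim)]


if __name__ == "__main__":
    benchmark = [1.0, 0.0]  # toy stand-in for the server-trained benchmark
    local_models = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    # In the protocol each client would compute its own score and prove it;
    # here the scores are computed in one place for brevity.
    scores = {c: cosine_similarity(m, benchmark)
              for c, m in local_models.items()}
    verified = {"a": True, "b": True, "c": False}  # "c" fails verification
    chosen = select_clients(scores, verified, k=1)
    print(chosen, aggregate(local_models, chosen))
```

In this sketch only the score and the verification result reach the selection step, and full parameter vectors are needed only for the selected clients, which is where the bandwidth saving would come from.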

Paper Structure

This paper contains 25 sections, 2 figures, 1 table, 1 algorithm.

Figures (2)

  • Figure 1: The structured framework of ZK-FL demonstrating that ZKP can be integrated at various stages of FL to address different trust issues in FL: Local Model Training Stage: ① client identity trust, ② local data trust, and ③ local training trust; Global Model Aggregation Stage: ④ model evaluation trust, and ⑤ model aggregation trust.
  • Figure 2: Experimental results of Veri-CS-FL: (a) Proof size with respect to model size; (b) Verification time and proof generation time with respect to model size; (c) Global model accuracy when aggregating local models from different numbers of clients; (d) Global model accuracy when using performance-based client selection (Veri-CS-FL) and random client selection (Rand-CS-FL).