Secure Shapley Value for Cross-Silo Federated Learning (Technical Report)

Shuyuan Zheng, Yang Cao, Masatoshi Yoshikawa

TL;DR

This work addresses the privacy challenges of computing Shapley values (SVs) for cross-silo federated learning with two protocols: HESV, a one-server solution based solely on homomorphic encryption (HE), and SecSV, a more efficient two-server protocol that combines HE for models with additive secret sharing for test data. Three techniques (Hybrid Secure Testing, Matrix Reducing, and SampleSkip) reduce the cost of secure SV evaluation. In the experiments, SecSV runs 7.2-36.6 times as fast as HESV across the evaluated tasks, with a limited loss in the accuracy of the calculated SVs, and the paper analyzes the protocols' security under its stated assumptions. The result is a way to value client contributions fairly without exposing private models or test data.
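To make the hybrid idea concrete, the sketch below shows why additively secret-sharing the test data removes the need to multiply two encrypted operands: matrix multiplication is linear, so each of two servers can multiply the model by its own share and the two partial results sum to the true product. This is only an illustration under loud assumptions. Everything here is plaintext integer arithmetic; in SecSV the model would be HE-encrypted, so each product would be ciphertext-plaintext. The modulus `P`, the helper names, and the toy matrices are hypothetical and not taken from the paper.

```python
import secrets

P = 2**61 - 1  # hypothetical prime modulus, chosen only for this sketch


def share(x):
    """Split an integer matrix into two additive shares modulo P."""
    s1 = [[secrets.randbelow(P) for _ in row] for row in x]
    s2 = [[(v - r) % P for v, r in zip(row, r_row)] for row, r_row in zip(x, s1)]
    return s1, s2


def matmul(a, b):
    """Plain matrix product modulo P."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) % P for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add(a, b):
    """Element-wise sum modulo P."""
    return [[(u + v) % P for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


W = [[2, 3], [1, 4]]  # toy model weights (stand-in for an HE ciphertext)
X = [[5, 6], [7, 8]]  # toy test samples, one per column

X1, X2 = share(X)     # each server receives one share and learns nothing about X alone
Y1 = matmul(W, X1)    # computed by server 1
Y2 = matmul(W, X2)    # computed by server 2
Y = add(Y1, Y2)       # recombined result equals W @ X modulo P
```

Real-valued models and features would additionally need a fixed-point encoding, which this sketch omits.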

Abstract

The Shapley value (SV) is a fair and principled metric for contribution evaluation in cross-silo federated learning (cross-silo FL), wherein organizations, i.e., clients, collaboratively train prediction models with the coordination of a parameter server. However, existing SV calculation methods for FL assume that the server can access the raw FL models and public test data. This may not be a valid assumption in practice considering the emerging privacy attacks on FL models and the fact that test data might be clients' private assets. Hence, we investigate the problem of secure SV calculation for cross-silo FL. We first propose HESV, a one-server solution based solely on homomorphic encryption (HE) for privacy protection, which has limitations in efficiency. To overcome these limitations, we propose SecSV, an efficient two-server protocol with the following novel features. First, SecSV utilizes a hybrid privacy protection scheme to avoid ciphertext-ciphertext multiplications between test data and models, which are extremely expensive under HE. Second, an efficient secure matrix multiplication method is proposed for SecSV. Third, SecSV strategically identifies and skips some test samples without significantly affecting the evaluation accuracy. Our experiments demonstrate that SecSV is 7.2 to 36.6 times as fast as HESV, with a limited loss in the accuracy of calculated SVs.
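For readers unfamiliar with the metric, the following is a minimal sketch of the standard (non-secure) Shapley value: each client's value is its marginal contribution to utility, averaged over all coalitions of the other clients. It is not the paper's secure protocol. The `utility` function and the accuracy table are hypothetical placeholders; in an FL setting the utility would typically be the test accuracy of a model aggregated from a coalition's updates.

```python
from itertools import combinations
from math import factorial


def shapley_values(clients, utility):
    """Exact Shapley values; utility maps a frozenset of clients to a number."""
    n = len(clients)
    values = {c: 0.0 for c in clients}
    for c in clients:
        others = [o for o in clients if o != c]
        for k in range(n):
            # Weight of a coalition of size k that does not contain c.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                values[c] += weight * (utility(s | {c}) - utility(s))
    return values


# Hypothetical test accuracies for every coalition of two clients.
toy_accuracy = {
    frozenset(): 0.0,
    frozenset({"A"}): 0.6,
    frozenset({"B"}): 0.5,
    frozenset({"A", "B"}): 0.8,
}

sv = shapley_values(["A", "B"], toy_accuracy.__getitem__)
```

The exact computation evaluates the utility on every coalition, so the number of model evaluations grows exponentially with the number of clients; this is why the cost of each secure evaluation matters so much.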

Paper Structure

This paper contains 35 sections, 7 theorems, 7 equations, 12 figures, 8 tables, 10 algorithms.

Key Result

lemma 1

We have $AB = \delta(A, B)$ for any $d_{out} \times d_{in}$ matrix $A$ and $d_{in} \times m$ matrix $B$, where $m \leq d_{in}$ and $d_{out} \leq d_{in}$. See the appendix for all the missing proofs.

Figures (12)

  • Figure 1: Secure federated training.
  • Figure 2: Secure SV calculation.
  • Figure 4: SOTA method for matrix multiplication.
  • Figure 5: Secure testing for SecSV.
  • Figure 6: Matrix Reducing for matrix multiplication.
  • ...and 7 more figures

Theorems & Definitions (11)

  • lemma 1: Correctness
  • lemma 2: Composition
  • lemma 3
  • definition 1: Linear classifier
  • theorem 1
  • lemma 4
  • definition 2: Equation-solving attack
  • definition 3: Membership inference attack
  • definition 4: Retraining attack
  • proposition 1
  • ...and 1 more