SLVR: Securely Leveraging Client Validation for Robust Federated Learning

Jihye Choi, Sai Rahul Rachuri, Ke Wang, Somesh Jha, Yizhen Wang

TL;DR

Secure federated learning preserves privacy but introduces a privacy-robustness tradeoff. SLVR uses secure multi-party computation (MPC) to leverage private client data for cross-client validation, removing the need for public validation data and enabling richer robustness predicates. It combines a cross-client check procedure with a secure aggregation protocol to weight or discard updates while preserving privacy, and it adapts to distribution shifts by securely refreshing its validation data. Empirically, SLVR improves robustness against adaptive poisoning attacks (gains of up to approximately $50\%$ on CIFAR-10) and converges stably under distribution shifts, with MPC overhead that remains practical and parallelizable.

Abstract

Federated Learning (FL) enables collaborative model training while keeping client data private. However, exposing individual client updates makes FL vulnerable to reconstruction attacks. Secure aggregation mitigates such privacy risks but prevents the server from verifying the validity of each client update, creating a privacy-robustness tradeoff. Recent efforts attempt to address this tradeoff by enforcing checks on client updates using zero-knowledge proofs, but they support limited predicates and often depend on public validation data. We propose SLVR, a general framework that securely leverages clients' private data through secure multi-party computation. By utilizing clients' data, SLVR not only eliminates the need for public validation data, but also enables a wider range of checks for robustness, including cross-client accuracy validation. It also adapts naturally to distribution shifts in client data, as it can securely refresh its validation data to keep it up to date. Our empirical evaluations show that SLVR improves robustness against model poisoning attacks, outperforming existing methods by up to 50% under adaptive attacks. Additionally, SLVR demonstrates effective adaptability and stable convergence under various distribution shift scenarios.
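To make the idea of cross-client accuracy validation concrete, the following is a minimal plaintext sketch, not the SLVR protocol itself. It assumes a toy linear classifier, runs entirely in the clear (in SLVR these steps would be carried out under MPC so that neither updates nor validation data are revealed), and uses hypothetical names throughout (`cross_client_scores`, `robust_aggregate`, the `keep` parameter). The scoring and weighting rule shown here is an illustrative assumption, not the rule defined in the paper.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Example = Tuple[Vector, int]  # (features, label in {0, 1})


def predict(weights: Vector, features: Vector) -> int:
    """Toy linear classifier: label 1 if the dot product is positive."""
    return 1 if sum(w * x for w, x in zip(weights, features)) > 0 else 0


def accuracy(weights: Vector, data: Sequence[Example]) -> float:
    """Fraction of examples the classifier labels correctly."""
    return sum(1 for x, y in data if predict(weights, x) == y) / len(data)


def cross_client_scores(global_w: Vector, updates: List[Vector],
                        val_sets: List[List[Example]]) -> List[float]:
    """Score update i by the accuracy of (global_w + update_i) on the
    validation data held by every *other* client (assumed scoring rule)."""
    scores = []
    for i, upd in enumerate(updates):
        candidate = [g + u for g, u in zip(global_w, upd)]
        others = [ex for j, data in enumerate(val_sets) if j != i for ex in data]
        scores.append(accuracy(candidate, others))
    return scores


def robust_aggregate(global_w: Vector, updates: List[Vector],
                     scores: List[float], keep: int) -> Vector:
    """Discard all but the `keep` best-scoring updates, then apply their
    score-weighted average to the global model (assumed weighting rule)."""
    kept = sorted(range(len(updates)), key=lambda i: scores[i], reverse=True)[:keep]
    total = sum(scores[i] for i in kept)
    if total == 0:
        return list(global_w)  # nothing trustworthy to apply this round
    return [g + sum(scores[i] * updates[i][d] for i in kept) / total
            for d, g in enumerate(global_w)]


# Toy round: the true rule is "label 1 iff the first feature is positive".
global_w = [0.0, 0.0]
updates = [[1.0, 0.0], [0.8, 0.1], [-5.0, 0.0]]  # the last update is poisoned
val_sets = [
    [([1.0, 0.2], 1), ([-1.0, 0.3], 0)],
    [([2.0, -0.5], 1), ([-0.5, 0.1], 0)],
    [([0.5, 0.0], 1), ([-2.0, 1.0], 0)],
]
scores = cross_client_scores(global_w, updates, val_sets)
new_w = robust_aggregate(global_w, updates, scores, keep=2)
print(scores, new_w)
```

In this toy round the poisoned update misclassifies every example held by the other clients, so it receives the lowest score and is dropped before averaging. A norm-based check alone would not necessarily catch such an update if the attacker scaled it to fit within the bound, which is the motivation the abstract gives for richer, data-driven predicates.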

Paper Structure

This paper contains 27 sections, 4 theorems, 11 equations, 11 figures, 4 tables.

Key Result

Theorem 1

Protocol $\Pi_{\mathsf{Sec\text{-}Agg}}$ securely realises the functionality $\mathcal{F}_{\mathsf{Sec\text{-}Agg}}$ in the presence of a malicious adversary that can statically corrupt up to $m_c < m/2$ parties in the protocol, in the $(\mathcal{F}_{\mathsf{Sec\text{-}Inf}}, \mathcal{F}_{\mathsf{Sort}}, \dots$

Figures (11)

  • Figure 1: Previous methods vs. SLVR. (a) Secure aggregation lacks the means to check model integrity privately. (b) Recent Byzantine-robust secure aggregation schemes enable integrity checks with zero-knowledge proofs, but only support simple predicates (e.g., the $\ell_2$ norm of the client update) and may rely on public datasets. (c) SLVR securely leverages clients' local data for integrity checks via MPC, supporting more powerful predicates without requiring public data.
  • Figure 2: Protocol for Robust Secure Aggregation
  • Figure 3: Protocol for Robustness Check
  • Figure 4: Robustness against Adaptive Attack with 20% malicious clients. Both SLVR (acc) and SLVR (prob) demonstrate competitive convergence performance against adaptive attacks. In contrast, other baselines that rely on static public validation datasets (e.g., Norm Bound ($\mathcal{D}_{val}$), Norm Ball, Cosine Similarity) or those that are adaptive but rely on simple validation checks (e.g., Norm Bound (adaptive)) become vulnerable.
  • Figure 5: Adaptability across the two distribution shift scenarios. We start with 100 clients working on MNIST data. Every 100 (or 250) rounds (marked by dotted lines), either 20 random clients transition to SVHN data (the evolving-clients scenario), or an additional 20 clients with SVHN data join the communication (the new-clients scenario). Our model consistently adapts and improves accuracy throughout the communication rounds, outperforming other aggregation protocols that struggle to adjust to distribution shifts.
  • ...and 6 more figures

Theorems & Definitions (9)

  • Theorem 1
  • Definition 2: Extreme Check Score Manipulation
  • Theorem 3
  • proof
  • Theorem 4
  • proof
  • Definition 5: Extreme Check Score Manipulation
  • Theorem 6
  • proof