Harnessing Sparsification in Federated Learning: A Secure, Efficient, and Differentially Private Realization

Shuangqing Xu, Yifeng Zheng, Zhongyun Hua

TL;DR

Clover tackles the communication bottleneck and privacy risks in federated learning by combining standard top-k gradient sparsification with a novel secure sparse vector aggregation mechanism (SparVecAgg) built on three-server replicated secret sharing. It achieves differential privacy through distributed noise generation and adds lightweight integrity checks against a malicious server, including verifiable noise sampling and blind MAC-based shuffle verification. The framework attains utility comparable to central DP, saves substantial communication and computation relative to ORAM-based baselines, and incurs only modest overhead when malicious security is enabled. Overall, Clover offers a practical, scalable approach to secure, private FL that uses sparsification for efficiency without sacrificing privacy or resilience to adversarial behavior.

Abstract

Federated learning (FL) enables multiple clients to jointly train a model by sharing only gradient updates for aggregation instead of raw data. Because many clients transmit very high-dimensional gradient updates, FL is known to suffer from a communication bottleneck. Meanwhile, the gradients shared by clients, as well as the trained model, may be exploited to infer private local datasets, so privacy remains a critical concern in FL. We present Clover, a novel system framework for communication-efficient, secure, and differentially private FL. To tackle the communication bottleneck, Clover follows a standard and commonly used approach, top-k gradient sparsification, in which each client sparsifies its gradient update so that only the k largest gradients (measured by magnitude) are preserved for aggregation. Clover provides a tailored mechanism, built on a trending distributed trust setting involving three servers, that efficiently aggregates multiple sparse vectors (top-k sparsified gradient updates) into a dense vector while hiding the values and indices of the non-zero elements in each sparse vector. This mechanism outperforms a baseline built on the general distributed ORAM technique by several orders of magnitude in server-side communication and runtime, and it also incurs a smaller client communication cost. We further integrate this mechanism with a lightweight distributed noise generation mechanism to offer differential privacy (DP) guarantees on the trained model. To harden Clover against a malicious server, we devise a series of lightweight mechanisms for integrity checks on the server-side computation. Extensive experiments show that Clover achieves utility comparable to vanilla FL with central DP, with promising performance.
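To make the sparsification step concrete, here is a minimal plaintext sketch of top-k gradient sparsification and of the sparse-to-dense aggregation that the abstract describes. It is an illustration only: the function names `top_k_sparsify` and `aggregate_sparse` and the toy gradients are assumptions, not taken from the paper. It shows only the functionality being computed. Nothing here is secret-shared or noised, so it hides neither indices nor values.

```python
def top_k_sparsify(gradient, k):
    """Keep the k largest-magnitude entries of `gradient`.

    Returns (indices, values) with indices in ascending order.
    Hypothetical helper for illustration; ties are broken by lower index.
    """
    if not 0 <= k <= len(gradient):
        raise ValueError("k must be between 0 and len(gradient)")
    # Rank coordinates by absolute value, largest first.
    ranked = sorted(range(len(gradient)), key=lambda i: (-abs(gradient[i]), i))
    indices = sorted(ranked[:k])
    values = [gradient[i] for i in indices]
    return indices, values


def aggregate_sparse(sparse_updates, d):
    """Sum sparse (indices, values) updates into one dense length-d vector.

    Plaintext reference only: a secure protocol would compute this sum
    without revealing any client's indices or values.
    """
    total = [0.0] * d
    for indices, values in sparse_updates:
        for i, v in zip(indices, values):
            total[i] += v
    return total


# Toy example: two clients, dimension d = 5, k = 2.
g1 = [0.1, -2.0, 0.3, 1.5, -0.05]
g2 = [1.0, 0.2, -3.0, 0.0, 0.4]
update1 = top_k_sparsify(g1, 2)
update2 = top_k_sparsify(g2, 2)
dense_sum = aggregate_sparse([update1, update2], d=5)
```

Each client uploads only k index-value pairs instead of d values, which is where the communication saving comes from. The difficulty the abstract points to is that the indices themselves are data-dependent, so they must be hidden along with the values.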

Paper Structure

This paper contains 31 sections, 5 theorems, 8 equations, 7 figures, 2 tables, 3 algorithms.

Key Result

Lemma 1

(Adaptive Composition of RDP). Let $\mathcal{M}_1: \mathcal{D}\rightarrow\mathcal{R}_1$ be a mechanism satisfying ($\alpha,\tau_1$)-RDP and $\mathcal{M}_2: \mathcal{D}\times\mathcal{R}_1\rightarrow\mathcal{R}_2$ be a mechanism satisfying ($\alpha,\tau_2$)-RDP. Define their combination $\mathcal{

Figures (7)

  • Figure 1: An intuitive example of decomposing the process of applying $\pi$ on $\boldsymbol{x}^{\prime}$ into applying $\pi_2,\pi_1,\pi_0$ sequentially on $\boldsymbol{x}^{\prime}$, where $\pi_1,\pi_0$ are random permutations and $\pi_2 = \pi_1^{-1}\circ\pi_0^{-1}\circ\pi$.
  • Figure 2: Comparison of test accuracy and test loss versus the number of training rounds $T$ for FL-top$_k$ and FL-rand$_k$ on different datasets, with the density of the sparsified gradient update $\lambda=0.5\%$.
  • Figure 3: Test accuracy and test loss of Clover and the baselines on different datasets versus the number of training rounds $T$. For all methods, the overall privacy budget is set to $\varepsilon=6.59$ for MNIST, $\varepsilon=11.96$ for CIFAR-10, and $\varepsilon=9.89$ for Fashion-MNIST.
  • Figure 4: Test accuracy of Clover and the baselines on different datasets versus the privacy budget $\varepsilon$. The number of training rounds $T$ for Clover, DP-FedAvg, and FedSMP-rand$_k$ is limited at $T \le 90$ for MNIST, $T \le 250$ for CIFAR-10, and $T \le 180$ for Fashion-MNIST. For FedSel, we limit $T \le 5000$ for MNIST and Fashion-MNIST, and $T \le 10^4$ for CIFAR-10.
  • Figure 5: Comparison of inter-server communication costs for securely aggregating 100 sparse vectors under different approaches. (a) Varying the density $\lambda\in\{0.01\%,0.1\%,0.5\%,1\%\}$ and fixing $d=10^5$. (b) Varying the vector dimension $d\in\{5\times10^3,10^4,5\times10^4,10^5\}$ and fixing $\lambda=1\%$.
  • ...and 2 more figures
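The decomposition in the Figure 1 caption can be checked with a short sketch. It assumes a particular convention, namely that applying a permutation moves the element at position i to position perm[i]; the helper names are hypothetical, and Python's `random` module stands in for whatever randomness source the actual protocol uses (it is not cryptographically secure). The sketch only verifies the algebraic identity $\pi = \pi_0\circ\pi_1\circ\pi_2$ and says nothing about which party holds which permutation.

```python
import random


def apply_perm(perm, x):
    """Move the element at position i to position perm[i] (assumed convention)."""
    out = [None] * len(x)
    for i, target in enumerate(perm):
        out[target] = x[i]
    return out


def invert(perm):
    """Return the inverse permutation."""
    inv = [0] * len(perm)
    for i, target in enumerate(perm):
        inv[target] = i
    return inv


def compose(f, g):
    """Return f ∘ g: the permutation that applies g first, then f."""
    return [f[g[i]] for i in range(len(g))]


def decompose(pi, rng):
    """Pick random pi0, pi1 and solve for pi2 = pi1^{-1} ∘ pi0^{-1} ∘ pi."""
    n = len(pi)
    pi0 = list(range(n))
    pi1 = list(range(n))
    rng.shuffle(pi0)
    rng.shuffle(pi1)
    pi2 = compose(invert(pi1), compose(invert(pi0), pi))
    return pi0, pi1, pi2


rng = random.Random(0)  # fixed seed for reproducibility; not secure randomness
pi = [2, 0, 3, 1]
x = ["a", "b", "c", "d"]
pi0, pi1, pi2 = decompose(pi, rng)

direct = apply_perm(pi, x)
# Apply pi2 first, then pi1, then pi0, as in the caption.
stepwise = apply_perm(pi0, apply_perm(pi1, apply_perm(pi2, x)))
```

Because $\pi_0$ and $\pi_1$ are chosen at random and $\pi_2$ absorbs the difference, the three-step application always reproduces the effect of $\pi$, whatever the random choices are.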

Theorems & Definitions (8)

  • Definition 1
  • Definition 2
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Theorem 4
  • Definition 3
  • Theorem 6