Breaking Secure Aggregation: Label Leakage from Aggregated Gradients in Federated Learning

Zhibo Wang, Zhiwei Chang, Jiahui Hu, Xiaoyi Pang, Jiacheng Du, Yongle Chen, Kui Ren

TL;DR

This paper proposes a stealthy label inference attack that bypasses SA to recover individual clients' private labels, achieving large-scale label recovery with 100% accuracy on various datasets and model architectures.

Abstract

Federated Learning (FL) exhibits privacy vulnerabilities under gradient inversion attacks (GIAs), which can extract private information from individual gradients. To enhance privacy, FL incorporates Secure Aggregation (SA) to prevent the server from obtaining individual gradients, thus effectively resisting GIAs. In this paper, we propose a stealthy label inference attack to bypass SA and recover individual clients' private labels. Specifically, we conduct a theoretical analysis of label inference from the aggregated gradients that are exclusively obtained after implementing SA. The analysis results reveal that the inputs (embeddings) and outputs (logits) of the final fully connected layer (FCL) contribute to gradient disaggregation and label restoration. To preset the embeddings and logits of FCL, we craft a fishing model by solely modifying the parameters of a single batch normalization (BN) layer in the original model. Distributing client-specific fishing models, the server can derive the individual gradients regarding the bias of FCL by resolving a linear system with expected embeddings and the aggregated gradients as coefficients. Then the labels of each client can be precisely computed based on preset logits and gradients of FCL's bias. Extensive experiments show that our attack achieves large-scale label recovery with 100% accuracy on various datasets and model architectures.
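
The disaggregation step described in the abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: it assumes a softmax cross-entropy loss averaged over the batch, that each client's fishing model yields the same preset embedding and the same preset logits for every sample, that SA reveals the plain sum of client gradients, and that the number of clients equals the embedding dimension so the linear system is square. All names (`solve`, `embeddings`, `agg_W`, and so on) are hypothetical.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# --- Assumed toy setup: 2 clients, 3 classes, batch size 4 ---
C, B = 3, 4
embeddings = [[1.0, 0.5], [0.2, 1.0]]   # preset embedding e_k per client
probs = softmax([0.0, 0.0, 0.0])        # preset logits -> known softmax output
private_labels = [[0, 0, 2, 1], [1, 1, 1, 2]]
K, d = len(embeddings), len(embeddings[0])

# --- Client side: FCL gradients, then summed as SA would reveal them ---
agg_W = [[0.0] * d for _ in range(C)]
for e_k, labels in zip(embeddings, private_labels):
    # Batch-averaged bias gradient: softmax output minus label frequency.
    g_k = [probs[c] - labels.count(c) / B for c in range(C)]
    for c in range(C):
        for j in range(d):
            # Constant embedding => weight gradient is the outer product g_k e_k^T.
            agg_W[c][j] += g_k[c] * e_k[j]

# --- Server side: sees only agg_W, the preset embeddings, and the preset logits ---
A = [[embeddings[k][j] for k in range(K)] for j in range(d)]  # A[j][k] = e_k[j]
recovered = [[0] * C for _ in range(K)]
for c in range(C):
    g_c = solve(A, agg_W[c])  # per-client bias gradients for class c
    for k in range(K):
        recovered[k][c] = round(B * (probs[c] - g_c[k]))

print(recovered)  # per-client label counts
```

In this sketch the embeddings act as the coefficient matrix and each row of the aggregated weight gradient as a right-hand side; the recovered bias gradients are then converted to per-class label counts using the known softmax output.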

Paper Structure (19 sections, 3 theorems, 19 equations, 6 figures, 2 tables)

This paper contains 19 sections, 3 theorems, 19 equations, 6 figures, and 2 tables.

Key Result

Proposition 1

Label $i$ is present in the training data when the one-sample gradient of the FCL's bias satisfies $\nabla{b}_i \leq 0$, where $\nabla{b}_i$ is the $i$-th entry of $\nabla{\boldsymbol{b}}$.
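
Proposition 1 can be checked on a small example. The sketch below assumes a softmax cross-entropy loss (the excerpt does not state the loss), for which the one-sample bias gradient of the final layer is the softmax output minus the one-hot label; the function names are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def bias_gradient(logits, label):
    """One-sample gradient of cross-entropy w.r.t. the FCL bias: softmax(z) - onehot(y)."""
    probs = softmax(logits)
    return [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]

def infer_labels(grad_b):
    """Apply the sign test of Proposition 1: indices whose gradient entry is <= 0."""
    return [i for i, g in enumerate(grad_b) if g <= 0]

logits = [0.3, -1.2, 2.0, 0.5]   # arbitrary example logits
grad_b = bias_gradient(logits, label=1)
recovered = infer_labels(grad_b)
print(recovered)
```

Because every softmax output lies strictly between 0 and 1, only the entry for the true label can be non-positive, which is why the sign alone identifies the label in the one-sample case.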

Figures (6)

  • Figure 1: Illustration of our Label Inference Attack against SA (LIA-SA).
  • Figure 2: Overview of the proposed Label Inference Attack against SA (LIA-SA).
  • Figure 3: The effect of batch size and number of selected clients on the LnAcc of LIA-SA (ours), LLG, and iLRG.
  • Figure 4: The effect of model depth on the label inference attacks.
  • Figure 5: Runtime evaluation of LIA-SA with varying batch size and the number of selected clients.
  • ...and 1 more figures

Theorems & Definitions (3)

  • Proposition 1
  • Proposition 2
  • Proposition 3