Membership Information Leakage in Federated Contrastive Learning

Kongyang Chen, Wenfeng Wang, Zixin Wang, Wangjun Zhang, Zhipeng Li, Yao Huang

TL;DR

This paper addresses privacy risks in Federated Contrastive Learning (FCL) by examining membership information leakage when training encoders on decentralized unlabeled data. It introduces two client-side attack paradigms—passive and active—with three passive variants, and demonstrates their effectiveness on SVHN, CIFAR-10, and CIFAR-100 under non-IID conditions. The results show that overfitting and access to encoder internals can enable member data to be inferred, and that an active gradient-ascent-based attack can achieve high discrimination accuracy. The work highlights practical privacy implications for FCL and suggests defenses such as differential privacy and early stopping to strengthen robustness against membership inference attacks.

Abstract

Federated Contrastive Learning (FCL) is an emerging approach for learning from decentralized unlabeled data while preserving data privacy. In FCL, participating clients collaboratively learn a global encoder from unlabeled data, and the encoder can then serve as a versatile feature extractor for diverse downstream tasks. Nonetheless, the distributed nature of FCL exposes it to privacy risks such as membership information leakage, an aspect often overlooked in current solutions. This study investigates the feasibility of executing a membership inference attack on FCL and proposes a robust attack methodology. The attacker's objective is to determine whether a given data sample was a member of the training data by accessing the model's inference output. Specifically, we concentrate on attackers located at a client, who can neither manipulate server-side aggregation methods nor discern the training status of other clients. We introduce two membership inference attacks tailored for FCL: the passive membership inference attack and the active membership inference attack, distinguished by the attacker's involvement in local model training. Experimental findings across diverse datasets validate the effectiveness of our attacks and underscore the inherent privacy risks associated with the federated contrastive learning paradigm.

Paper Structure

This paper contains 30 sections, 8 equations, 13 figures, 4 tables, and 1 algorithm.

Figures (13)

  • Figure 1: System architecture of FCL.
  • Figure 2: Passive membership inference attack on FCL: The attacker refrains from disrupting the FCL training process and solely acquires the aggregated model parameters for inference.
  • Figure 3: To generate augmented data, we start with the training dataset $D$. We randomly augment each sample in $D$ to produce $n$ augmented samples, which are then fed into the model's encoder to obtain feature vectors $F$. Next, we calculate the cosine similarity $SIM$ among the augmented samples from the same source. Finally, we use the three highest cosine similarity values (Top3) for each sample as the 3D features of the input data for training the binary classifier.
  • Figure 4: Active membership inference attack on FCL: The attacker uploads modified model parameters to expedite the inference process.
  • Figure 5: The overfitting characteristics across different datasets.
  • ...and 8 more figures
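
The feature-extraction steps described in the Figure 3 caption can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_augment` and `make_toy_encoder` are hypothetical stand-ins (Gaussian noise and a fixed random projection) for the real augmentations and the FCL encoder, and the default `n=8` is an arbitrary choice. Only the overall flow (augment, encode, pairwise cosine similarity, keep the top three values) follows the caption.

```python
import numpy as np


def toy_augment(x, rng):
    """Hypothetical augmentation: add small Gaussian noise to the input."""
    return x + 0.1 * rng.standard_normal(x.shape)


def make_toy_encoder(in_dim, out_dim, seed=0):
    """Hypothetical encoder: fixed random linear projection followed by tanh."""
    w = np.random.default_rng(seed).standard_normal((in_dim, out_dim))
    return lambda x: np.tanh(x.reshape(-1) @ w)


def top3_similarity_features(x, encoder, augment, n=8, rng=None):
    """Return the three highest pairwise cosine similarities among
    n augmented views of a single sample x, in descending order."""
    if n < 3:
        raise ValueError("n must be at least 3 to obtain three pairwise similarities")
    rng = np.random.default_rng() if rng is None else rng

    # Step 1: produce n augmented views of the same source sample.
    views = [augment(x, rng) for _ in range(n)]

    # Step 2: encode every view into a feature vector (rows of F).
    feats = np.stack([encoder(v) for v in views])

    # Step 3: cosine similarity between all pairs of views.
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    unit = feats / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T

    # Keep each unordered pair once and drop self-similarities.
    pairwise = sim[np.triu_indices(n, k=1)]

    # Step 4: the three largest values form the 3D feature vector.
    return np.sort(pairwise)[-3:][::-1]


if __name__ == "__main__":
    encoder = make_toy_encoder(in_dim=16, out_dim=4)
    sample = np.random.default_rng(0).standard_normal(16)
    print(top3_similarity_features(sample, encoder, toy_augment, n=8))
```

Per the caption, 3D vectors of this kind, computed for each input sample, are what the binary membership classifier is trained on; the classifier itself is omitted here.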