Membership Information Leakage in Federated Contrastive Learning
Kongyang Chen, Wenfeng Wang, Zixin Wang, Wangjun Zhang, Zhipeng Li, Yao Huang
TL;DR
This paper examines privacy risks in Federated Contrastive Learning (FCL), focusing on membership information leakage when encoders are trained on decentralized unlabeled data. It introduces two client-side attack paradigms, passive and active, with three passive variants, and demonstrates their effectiveness on SVHN, CIFAR-10, and CIFAR-100 under non-IID conditions. The results show that overfitting and access to encoder internals can allow an attacker to infer which samples are training members, and that an active attack based on gradient ascent can achieve high discrimination accuracy. The work highlights practical privacy implications for FCL and suggests defenses such as differential privacy and early stopping to strengthen robustness against membership inference attacks.
Abstract
Federated Contrastive Learning (FCL) is an emerging approach for learning from decentralized unlabeled data while preserving data privacy. In FCL, participating clients collaboratively train a global encoder on unlabeled data, and the encoder can then serve as a general-purpose feature extractor for diverse downstream tasks. Nonetheless, the distributed nature of FCL exposes it to privacy risks such as membership information leakage, an aspect often overlooked in current solutions. This study investigates the feasibility of mounting a membership inference attack on FCL and proposes a robust attack methodology. The attacker's objective is to determine, from the model's inference output, whether a given data sample was part of the training data. Specifically, we focus on an attacker located at a client, who can neither manipulate server-side aggregation nor observe the training status of other clients. We introduce two membership inference attacks tailored for FCL: the \textit{passive membership inference attack} and the \textit{active membership inference attack}, distinguished by the attacker's involvement in local model training. Experimental results across diverse datasets validate the effectiveness of our attacks and underscore the inherent privacy risks of the federated contrastive learning paradigm.
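
To make the attacker's decision rule concrete, the following is a minimal sketch of a generic threshold-based membership test on an encoder's output. It is not the attack proposed in this paper: the scoring rule (mean pairwise cosine similarity between embeddings of augmented views), the toy encoder that "memorizes" one sample, the noise augmentation, and the names `membership_score`, `infer_membership`, `toy_encoder`, and `THRESHOLD` are all assumptions chosen for illustration.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def membership_score(encoder, augment, x, n_views=8, rng=None):
    """Mean pairwise cosine similarity between embeddings of augmented views of x.

    Assumption: an overfit contrastive encoder maps views of a training
    sample closer together than views of an unseen sample.
    """
    rng = rng or random.Random(0)
    embeddings = [encoder(augment(x, rng)) for _ in range(n_views)]
    sims = [
        cosine(embeddings[i], embeddings[j])
        for i in range(n_views)
        for j in range(i + 1, n_views)
    ]
    return sum(sims) / len(sims)

def infer_membership(score, threshold):
    """Predict 'member' when the score reaches the threshold."""
    return score >= threshold

# --- Hypothetical toy setup, only to exercise the functions above ---
MEMBER = [1.0, 0.0, 0.0]      # sample the toy encoder has memorized
NON_MEMBER = [0.0, 1.0, 0.0]  # sample the toy encoder has never seen
THRESHOLD = 0.999             # illustrative value, not tuned on real data

def augment(x, rng):
    """Stand-in for data augmentation: add small Gaussian noise."""
    return [xi + rng.gauss(0.0, 0.1) for xi in x]

def toy_encoder(v):
    """Stand-in for an overfit encoder: snaps inputs near MEMBER onto MEMBER."""
    if math.dist(v, MEMBER) < 0.7:
        return list(MEMBER)
    return list(v)

if __name__ == "__main__":
    for name, sample in [("member", MEMBER), ("non-member", NON_MEMBER)]:
        score = membership_score(toy_encoder, augment, sample)
        print(name, round(score, 4), infer_membership(score, THRESHOLD))
```

In a real setting the threshold would have to be calibrated, for example on samples whose membership status the attacker already knows, and the encoder and augmentations would be those of the target system rather than the stand-ins used here.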
