Chameleon: Increasing Label-Only Membership Leakage with Adaptive Poisoning
Harsh Chaudhari, Giorgio Severi, Alina Oprea, Jonathan Ullman
TL;DR
Chameleon is a label-only membership inference (MI) attack that combines an adaptive data-poisoning strategy with a neighborhood-based proxy for confidence scores, achieving strong leakage with as few as $64$ queries per challenge point. The attack extends to multiple challenge points with bounded overhead and improves substantially on prior label-only MI methods across vision and tabular data, including CIFAR-10/100 and Purchase-100, with TPR gains of up to roughly $23\%$ at $1\%$ FPR and notable AUC improvements. A theoretical analysis shows that poisoning can amplify leakage up to an optimal point, in line with the empirical results. Experiments with differential privacy (DP) as a defense reveal a clear trade-off between protection and model utility. The work highlights practical privacy risks in label-only settings and outlines future directions: low-FPR leakage without poisoning and more efficient defenses.
Abstract
The integration of machine learning (ML) in numerous critical applications introduces a range of privacy concerns for individuals who provide their datasets for model training. One such privacy risk is Membership Inference (MI), in which an attacker seeks to determine whether a particular data sample was included in the training dataset of a model. Current state-of-the-art MI attacks capitalize on access to the model's predicted confidence scores to successfully perform membership inference, and employ data poisoning to further enhance their effectiveness. In this work, we focus on the less explored and more realistic label-only setting, where the model provides only the predicted label on a queried sample. We show that existing label-only MI attacks are ineffective at inferring membership in the low False Positive Rate (FPR) regime. To address this challenge, we propose a new attack, Chameleon, that leverages a novel adaptive data poisoning strategy and an efficient query selection method to achieve significantly more accurate membership inference than existing label-only attacks, especially at low FPRs.
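To make the low-FPR evaluation concrete, the following is a minimal sketch of how a true positive rate at a fixed false positive rate can be computed from per-sample membership scores. It is not taken from the paper: the function name `tpr_at_fpr`, the score arrays, and the convention that a higher score means "more likely a member" are assumptions made for illustration.

```python
import numpy as np

def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.01):
    """Return (tpr, fpr, threshold) for the strictest-FPR rule 'score > threshold'.

    Assumption (illustrative): higher score = more likely a training member.
    The threshold is chosen so that at most target_fpr of non-members are flagged.
    """
    members = np.asarray(member_scores, dtype=float)
    # Sort non-member scores from highest to lowest.
    nonmembers = np.sort(np.asarray(nonmember_scores, dtype=float))[::-1]
    n = len(nonmembers)

    # Number of non-members we may flag; the epsilon guards against
    # floating-point rounding just below an integer.
    k = int(np.floor(target_fpr * n + 1e-9))

    # Flagging scores strictly above the (k+1)-th largest non-member score
    # flags at most k non-members.
    threshold = nonmembers[k] if k < n else -np.inf

    tpr = float(np.mean(members > threshold))
    fpr = float(np.mean(nonmembers > threshold))
    return tpr, fpr, threshold

# Toy data: 100 non-member scores 0.00..0.99 and four member scores.
nonmember_scores = np.arange(100) / 100.0
member_scores = np.array([0.995, 0.99, 0.5, 0.3])
tpr, fpr, threshold = tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.01)
print(f"TPR={tpr:.2f} at FPR={fpr:.2f} (threshold={threshold:.2f})")
```

Reporting TPR at a small fixed FPR, rather than average accuracy alone, rewards attacks that identify some members with very few false alarms, which is the regime the abstract refers to. When scores take only a few distinct values, ties at the threshold can make the achieved FPR fall below the target.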
