Curation Leaks: Membership Inference Attacks against Data Curation for Machine Learning

Dariush Wahdany; Matthew Jagielski; Adam Dziedzic; Franziska Boenisch

TL;DR

Novel attacks against popular curation methods are introduced, demonstrating that each stage reveals information about the private dataset and that even models trained exclusively on curated public data leak membership information about the private data that guided curation.

Abstract

In machine learning, curation is used to select the most valuable data for improving both model accuracy and computational efficiency. Recently, curation has also been explored as a solution for private machine learning: rather than training directly on sensitive data, which is known to leak information through model predictions, the private data is used only to guide the selection of useful public data. The resulting model is then trained solely on curated public data. It is tempting to assume that such a model is privacy-preserving because it has never seen the private data. Yet, we show that without further protection, curation pipelines can still leak private information. Specifically, we introduce novel attacks against popular curation methods, targeting every major step: the computation of curation scores, the selection of the curated subset, and the final trained model. We demonstrate that each stage reveals information about the private dataset and that even models trained exclusively on curated public data leak membership information about the private data that guided curation. These findings highlight the previously overlooked inherent privacy risks of data curation and show that privacy assessment must extend beyond the training procedure to include the data selection process. Our differentially private adaptations of curation methods effectively mitigate leakage, indicating that formal privacy guarantees for curation are a promising direction.
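To make the setting concrete, the following is a minimal sketch of a curation pipeline in which private data only guides the selection of public data. It is not the authors' method. The scoring rule (negative Euclidean distance to the nearest private sample), the function names `curation_scores` and `select_top_k`, and the synthetic data are all assumptions chosen for illustration. The sketch shows why such scores can depend on a single private record: adding one hypothetical target to the private set changes the score of any public sample for which that target becomes the nearest neighbor.

```python
import numpy as np

def curation_scores(public, private):
    """Score each public sample by its negative distance to the nearest private sample.

    Assumed toy scoring rule for illustration, not the scoring used in the paper.
    """
    # Pairwise Euclidean distances, shape (n_public, n_private).
    dists = np.linalg.norm(public[:, None, :] - private[None, :, :], axis=-1)
    return -dists.min(axis=1)

def select_top_k(scores, k):
    """Return the indices of the k highest-scoring public samples."""
    return np.argsort(-scores)[:k]

rng = np.random.default_rng(0)
public = rng.normal(size=(200, 8))   # synthetic public pool
private = rng.normal(size=(20, 8))   # synthetic private set that guides curation
target = public[0] + 0.01            # hypothetical target record close to public sample 0

# Scores when the target is absent from ("out") or present in ("in") the private set.
scores_out = curation_scores(public, private)
scores_in = curation_scores(public, np.vstack([private, target]))

# Public samples whose score depends on whether the target was used for curation.
changed = np.flatnonzero(scores_in != scores_out)
print("public samples affected by the target:", changed)
print("sample 0 selected with target:", 0 in select_top_k(scores_in, 10))
```

In this toy setup, an observer who sees the scores or the selected subset can compare them against what would be expected without the target. A target that is not the nearest neighbor of any public sample leaves the scores unchanged under this rule.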

Paper Structure (65 sections, 45 equations, 37 figures, 5 tables, 5 algorithms)

Introduction
Background and Related Work
Attacking Data Curation Pipelines
Threat Model
Adversary Goal.
Adversary Capabilities and Knowledge.
Adapting LiRA for Curation
Score-Based Attacks
Subset Selection Attacks
Final Models Attack
Summary of Attacks
Evaluation
Attack Performance on Curation Scores
Attack Performance on Binary Selections
End-to-End Attacks on Trained Models
...and 50 more sections

Figures (37)

Figure 1: We attack private data $\mathcal{T}$ used to curate a public dataset $\mathcal{D}$. We show that the scores $s$, top-scoring subsets $\mathcal{\tilde{D}}$ and even trained models $M$ leak membership information.
Figure 2: Influence sparsity in Image-based curation. Distribution of how many public samples have each target as their nearest neighbor. The concentration at zero demonstrates that most targets have no direct influence on curation scores, necessitating our fingerprinting approach.
Figure 3: Attack success for curation scores and subsets. Image-based curation's nearest-neighbor mechanism is highly vulnerable, while TRAK's gradient averaging shows almost no leakage.
Figure 4: Attack success correlates with influence patterns from Figure 2. Cross-dataset comparison shows TPR at 1% FPR inversely correlates with the percentage of zero-influence targets.
Figure 5: End-to-end membership inference success. Image-based curation shows consistent partial leakage, while TRAK exhibits size-dependent vulnerability.
...and 32 more figures
