A General Framework for Data-Use Auditing of ML Models

Zonghao Huang, Neil Zhenqiang Gong, Michael K. Reiter

TL;DR

The paper addresses how a data owner can verify whether their content was used to train an ML model. It introduces a general, proactive auditing framework that turns any black-box membership inference method into a tunable, verifiable test with a controlled false-detection rate. The framework combines a task-agnostic data-marking scheme, a contrastive membership inference score, and a sequential, PPRM-based detection procedure, so that data use can be detected without full access to the training data. The authors report strong, robust performance on both image classifiers and foundation models (visual encoders, Llama 2, CLIP), outperforming state-of-the-art baselines and remaining resilient to several adaptive countermeasures at modest utility cost. The work also discusses practical considerations, including multi-owner scenarios, the cost of foundation-model experiments, and pathways toward verifiable unlearning and third-party claim verification.

Abstract

Auditing the use of data in training machine-learning (ML) models is an increasingly pressing challenge, as myriad ML practitioners routinely leverage the effort of content creators to train models without their permission. In this paper, we propose a general method to audit an ML model for the use of a data-owner's data in training, without prior knowledge of the ML task for which the data might be used. Our method leverages any existing black-box membership inference method, together with a sequential hypothesis test of our own design, to detect data use with a quantifiable, tunable false-detection rate. We show the effectiveness of our proposed framework by applying it to audit data use in two types of ML models, namely image classifiers and foundation models.
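The contrastive idea behind the audit can be illustrated with a toy sketch. It assumes that each audited item comes as a (published, unpublished) pair and that the auditor only has a black-box score function where a higher score means "more member-like". The toy scorer, its `bonus` parameter, and the function names are hypothetical stand-ins, not the authors' implementation, and the sketch leaves out the sequential hypothesis test entirely.

```python
import random


def contrastive_wins(score_fn, pairs):
    """Count pairs whose published item scores higher than its unpublished twin.

    score_fn: any black-box membership-inference score (higher = more member-like).
    pairs: iterable of (published, unpublished) items.
    """
    return sum(1 for pub, unpub in pairs if score_fn(pub) > score_fn(unpub))


def make_toy_scorer(training_set, bonus, rng):
    """Hypothetical scorer: Gaussian noise plus a bonus for items in the training set."""
    def score(item):
        return rng.gauss(0.0, 1.0) + (bonus if item in training_set else 0.0)
    return score


rng = random.Random(1)
pairs = [((i, "pub"), (i, "unpub")) for i in range(200)]

# Model that trained on the published items vs. a model that never saw them.
used = make_toy_scorer({pub for pub, _ in pairs}, bonus=2.0, rng=rng)
unused = make_toy_scorer(set(), bonus=0.0, rng=rng)

wins_used = contrastive_wins(used, pairs)
wins_unused = contrastive_wins(unused, pairs)
print(wins_used, wins_unused)
```

When the data was not used, each comparison behaves like a fair coin flip, so the count stays near half of the pairs; a count well above half is the signal a detection test would look for.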

Paper Structure

This paper contains 92 sections, 1 theorem, 8 equations, 6 figures, 14 tables, and 4 algorithms.

Key Result

Theorem 1

For $\sumThreshold \in \left\{\left\lceil \frac{\totalSamples}{2} \right\rceil, \ldots, \totalSamples\right\}$ and $\Confidence < \FalseDetectionRate$ such that $\left(\frac{\exp\left(\frac{2\sumThreshold}{\totalSamples} - 1\right)}{\left(\frac{2\sumThreshold}{\totalSamples}\right)^{\frac{2\sumThreshold}{\totalSamples}}}\right)^{\frac{\totalSamples}{2}} \leq \ldots$
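To get a feel for the quantity on the left-hand side of the condition in Theorem 1, the following sketch evaluates it numerically. The short names `threshold` (for $\sumThreshold$) and `total` (for $\totalSamples$) are assumptions made here for readability. The value it must be compared against is not reproduced in the statement above, so the sketch only computes the left-hand side.

```python
import math


def bound_lhs(threshold, total):
    """Evaluate (exp(r - 1) / r**r) ** (total / 2) with r = 2 * threshold / total.

    The theorem restricts threshold to {ceil(total / 2), ..., total}, so 1 <= r <= 2.
    """
    if not math.ceil(total / 2) <= threshold <= total:
        raise ValueError("threshold must lie in {ceil(total/2), ..., total}")
    r = 2 * threshold / total
    # Work in log space so that large `total` does not underflow prematurely.
    log_value = (total / 2) * ((r - 1) - r * math.log(r))
    return math.exp(log_value)


for t in (50, 60, 70, 100):
    print(t, bound_lhs(t, 100))
```

The expression equals 1 when the threshold is exactly half the sample count and shrinks as the threshold grows, which matches the intuition that a stricter threshold makes a false detection less likely.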

Figures (6)

  • Figure 1: The impact of $\frac{\PublishedInTrainDatasetSize}{\totalSamples}$ on the detection performance (the default $\frac{\PublishedInTrainDatasetSize}{\totalSamples}$ is $1.0$). The results from $\frac{\PublishedInTrainDatasetSize}{\totalSamples} = 0$ are the false-detections of our method.
  • Figure 2: The impact of epochs on the detection performance and encoder utility. The evaluated encoder was trained by SimCLR on marked CIFAR-100 ($10\%$ are marked). The results are averaged over 20 experiments.
  • Figure 3: Examples of marked CIFAR-10 images ($\MarkBound=10$). First row: raw images; Second row: published images; Last row: unpublished images.
  • Figure 4: Examples of marked CIFAR-100 images ($\MarkBound=10$). First row: raw images; Second row: published images; Last row: unpublished images.
  • Figure 5: Examples of marked TinyImageNet images ($\MarkBound=10$). First row: raw images; Second row: published images; Last row: unpublished images.
  • ...and 1 more figure
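The captions for Figures 3 to 5 indicate that each raw image yields a published and an unpublished marked version, with the mark bounded by $\MarkBound=10$. A minimal sketch of that idea, on a flat list of pixel values, might look as follows. The random ±bound pattern and the names `make_marked_pair` and `publish_one` are hypothetical illustrations and should not be read as the paper's marking algorithm.

```python
import random


def make_marked_pair(pixels, bound=10, rng=None):
    """Return two marked copies of `pixels` (ints in 0..255).

    Hypothetical mark: a random +/-bound pattern is added to one copy and
    subtracted from the other, so each copy stays within `bound` of the raw value.
    """
    if rng is None:
        rng = random.Random()
    signs = [rng.choice((-1, 1)) for _ in pixels]

    def clip(value):
        return max(0, min(255, value))

    version_a = [clip(p + s * bound) for p, s in zip(pixels, signs)]
    version_b = [clip(p - s * bound) for p, s in zip(pixels, signs)]
    return version_a, version_b


def publish_one(pair, rng=None):
    """Pick one version uniformly at random to publish; keep the other private."""
    if rng is None:
        rng = random.Random()
    idx = rng.randrange(2)
    return pair[idx], pair[1 - idx]


raw = [0, 17, 128, 250, 255]
rng = random.Random(0)
pair = make_marked_pair(raw, bound=10, rng=rng)
published, unpublished = publish_one(pair, rng=rng)

# Largest per-pixel deviation from the raw image across both versions.
max_change = max(abs(m - r) for version in pair for m, r in zip(version, raw))
print(published, unpublished, max_change)
```

Because the published version is chosen by a coin flip, a model that never saw the published data has no reason to favor it over the unpublished twin.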

Theorems & Definitions (1)

  • Theorem 1: False detection rate