BridgePure: Limited Protection Leakage Can Break Black-Box Data Protection

Yihan Wang, Yiwei Lu, Xiao-Shan Gao, Gautam Kamath, Yaoliang Yu

TL;DR

BridgePure demonstrates that limited protection leakage can almost completely nullify black-box data protections by training a denoising diffusion bridge model to approximate the inverse of the protection. Using a small set of leaked unprotected-protected pairs, it learns to purify protected data across unseen samples within the same distribution, often restoring model performance to unprotected levels. The approach outperforms prior purification methods on both classification tasks and style mimicry, while preserving image details and reducing artifacts. The findings highlight a significant security risk for current black-box protections and underscore the need for stronger, leakage-resilient defenses and governance around data protection services.

Abstract

Availability attacks, or unlearnable examples, are defensive techniques that allow data owners to modify their datasets in ways that prevent unauthorized machine learning models from learning effectively while maintaining the data's intended functionality. These techniques have led to the release of popular black-box tools (e.g., APIs) that let users upload personal data and receive protected counterparts. In this work, we show that such black-box protections can be substantially compromised if a small set of unprotected in-distribution data is available. Specifically, we propose a novel threat model of protection leakage, where an adversary can (1) easily acquire (unprotected, protected) pairs by querying the black-box protections with a small unprotected dataset; and (2) train a diffusion bridge model to build a mapping between unprotected and protected data. This mapping, termed BridgePure, can effectively remove the protection from any previously unseen data within the same distribution. BridgePure demonstrates superior purification performance on classification and style mimicry tasks, exposing critical vulnerabilities in black-box data protection. We suggest that practitioners implement multi-level countermeasures to mitigate such risks.
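
The two-step threat model in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `protect` is a hypothetical stand-in for a black-box protection API that adds one fixed perturbation, and a simple mean-residual estimator replaces the diffusion bridge model that BridgePure actually trains. The sketch shows only why a few leaked pairs can generalize to unseen in-distribution data, not how the paper's method is implemented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical protection: one fixed, sample-independent perturbation.
# Real protections are more complex; this is a toy assumption.
_DELTA = rng.uniform(-8 / 255, 8 / 255, size=(3, 32, 32))


def protect(x):
    """Toy stand-in for a black-box protection API (not from the paper)."""
    return np.clip(x + _DELTA, 0.0, 1.0)


def collect_pairs(unprotected):
    """Step (1): query the black box to get (unprotected, protected) pairs."""
    return unprotected, protect(unprotected)


def fit_inverse(unprotected, protected):
    """Step (2), simplified: estimate the mean perturbation from leaked pairs.

    BridgePure trains a diffusion bridge model at this step; the mean
    residual is a deliberately crude substitute.
    """
    return (protected - unprotected).mean(axis=0)


def purify(protected, delta_hat):
    """Apply the learned inverse mapping to protected data."""
    return np.clip(protected - delta_hat, 0.0, 1.0)


# Pixel values stay in [0.1, 0.9] so that clipping never activates.
leaked = rng.uniform(0.1, 0.9, size=(64, 3, 32, 32))   # small leaked set
unseen = rng.uniform(0.1, 0.9, size=(16, 3, 32, 32))   # unseen, same distribution

x, x_prot = collect_pairs(leaked)
delta_hat = fit_inverse(x, x_prot)

unseen_prot = protect(unseen)
recovered = purify(unseen_prot, delta_hat)

err_before = np.abs(unseen_prot - unseen).mean()  # distortion from protection
err_after = np.abs(recovered - unseen).mean()     # residual after purification
print(f"mean abs error before: {err_before:.5f}, after: {err_after:.2e}")
```

In this toy setting the inverse is recovered almost exactly because the perturbation does not depend on the sample. Sample-dependent protections are presumably where a learned mapping such as a bridge model becomes necessary.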

Paper Structure

This paper contains 67 sections, 7 equations, 18 figures, and 9 tables.

Figures (18)

  • Figure 1: The threat model and illustration of BridgePure. Sequential images show the ODE sampling (purification) process of an example image protected by One-Pixel Shortcut [WuCXH23].
  • Figure 2: Performance comparison with augmentation-based methods, and protection dilution on CIFAR-100.
  • Figure 3: Purification performance against three availability attacks, evaluated with SimCLR.
  • Figure 4: PSNR and SSIM between processed datasets and the original CIFAR-10.
  • Figure 5: Purification outcomes on UC-protected Cars. The left is the overview comparison and the right shows local details around the wheel. We point out (1) the light, (2) the tire, and (3) the wheel hub where BridgePure-0.5K preserves the original texture while DiffPure ($t^*=0.2$) blurs details.
  • ...and 13 more figures
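
Figure 4 compares processed datasets with the original CIFAR-10 using PSNR and SSIM. As a reminder of what the first metric measures, here is a minimal sketch of the standard PSNR definition, $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The function name `psnr` and the assumption that pixel values lie in $[0, 1]$ are choices made for this illustration and are not taken from the paper.

```python
import numpy as np


def psnr(reference, processed, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_value is the largest possible pixel value (assumed 1.0 here).
    """
    reference = np.asarray(reference, dtype=np.float64)
    processed = np.asarray(processed, dtype=np.float64)
    mse = np.mean((reference - processed) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


# A uniform error of 0.1 gives MSE = 0.01, hence 20 dB.
clean = np.zeros((3, 32, 32))
noisy = np.full((3, 32, 32), 0.1)
print(psnr(clean, noisy))
```

Higher PSNR means the processed image is closer to the original, so a purification method that preserves detail should score higher than one that blurs.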