Towards Reliable Verification of Unauthorized Data Usage in Personalized Text-to-Image Diffusion Models

Boheng Li, Yanhao Wei, Yankai Fu, Zhenting Wang, Yiming Li, Jie Zhang, Run Wang, Tianwei Zhang

TL;DR

SIREN is a novel methodology for proactively tracing unauthorized data usage in black-box personalized text-to-image diffusion models. It optimizes the coating so that the model recognizes it as a feature relevant to the personalization task, which significantly improves the coating's learnability.

Abstract

Text-to-image diffusion models are pushing the boundaries of what generative AI can achieve in our lives. Beyond their ability to generate general images, new personalization techniques have been proposed to customize pre-trained base models for crafting images with specific themes or styles. Such a lightweight solution lets AI practitioners and developers easily build their own personalized models, but it also raises a new concern: whether the personalized models are trained on unauthorized data. A promising solution is to proactively enable data traceability in generative models, where data owners embed external coatings (e.g., image watermarks or backdoor triggers) into their datasets before releasing them. Models later trained on such datasets will also learn the coatings and unconsciously reproduce them in the generated mimicries, from which the coatings can be extracted and used as evidence of data usage. However, we find that existing coatings cannot be effectively learned in personalization tasks, making the corresponding verification less reliable. In this paper, we introduce SIREN, a novel methodology to proactively trace unauthorized data usage in black-box personalized text-to-image diffusion models. Our approach carefully optimizes the coating so that the model recognizes it as a feature relevant to the personalization task, thus significantly improving its learnability. We also utilize a human perceptual-aware constraint, a hypersphere classification technique, and a hypothesis-testing-guided verification method to enhance the stealthiness and detection accuracy of the coating. The effectiveness of SIREN is verified through extensive experiments on a diverse set of benchmark datasets, models, and learning algorithms. SIREN is also effective in various real-world scenarios and is evaluated against potential countermeasures. Our code is publicly available.
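To make the hypothesis-testing-guided verification step more concrete, the sketch below shows one way a data owner might decide whether a suspect model reproduces the coating. It is an illustration under stated assumptions, not the paper's procedure: the abstract does not specify the test, so a one-sided permutation test is swapped in here, and the names `permutation_p_value`, `verify`, and `alpha`, as well as the idea of a per-image detector score, are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(suspect_scores, clean_scores, n_perm=5000, seed=0):
    """One-sided permutation test (assumed stand-in for the paper's test).

    Null hypothesis: detector scores on the suspect model's outputs are
    no higher than scores on outputs of a model known to be clean.
    """
    rng = random.Random(seed)
    observed = mean(suspect_scores) - mean(clean_scores)
    pooled = list(suspect_scores) + list(clean_scores)
    k = len(suspect_scores)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Difference in means under a random relabelling of the scores.
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def verify(suspect_scores, clean_scores, alpha=0.05):
    """Flag unauthorized usage if the null is rejected at level alpha."""
    p = permutation_p_value(suspect_scores, clean_scores)
    return p < alpha, p


# Hypothetical detector scores (higher = coating more likely present).
suspect = [0.90, 0.85, 0.92, 0.88, 0.95, 0.91, 0.87, 0.93]
clean = [0.10, 0.20, 0.15, 0.05, 0.12, 0.18, 0.09, 0.14]
flagged, p_value = verify(suspect, clean)
print(flagged, p_value)
```

Framing the decision as a test gives the data owner an explicit significance level `alpha` that bounds the false-accusation rate, rather than relying on an ad hoc score threshold.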

Paper Structure

This paper contains 33 sections, 10 equations, 18 figures, 11 tables, 2 algorithms.

Figures (18)

  • Figure 1: Evaluation results of watermark-based methods on the Dog dataset (ruiz2023dreambooth). The personalization method is DreamBooth (ruiz2023dreambooth). The model quickly learns the new concept while ignoring the watermark. The bit accuracy would be $50\%$ for random guesses.
  • Figure 2: A comparison of (a) the original uncoated image with images coated under (b) an $\ell_\infty$ constraint of $11/255$ and (c) our SIREN. Notably, the coating optimized under the $\ell_\infty$ constraint introduces unnatural artifacts in flat and bright color areas (e.g., the face of the woman), while our coating looks much more natural.
  • Figure 3: An example feature-space illustration comparing binary classification (left) with hypersphere classification (right). Direct binary classification might be biased by the incomplete negative training data, while hypersphere classification mainly focuses on positive samples and generalizes better on unseen negative data.
  • Figure 4: Effectiveness comparison in the fine-tuning personalization scenarios.
  • Figure 5: Effectiveness comparison with advanced personalization methods. The dataset is Dog (ruiz2023dreambooth).
  • ...and 13 more figures
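
The hypersphere classification idea in the Figure 3 caption can be sketched as follows. This is a deliberately simplified illustration, not the paper's detector: it assumes features are already extracted, takes the sphere center to be the mean of the positive (coated) features, and sets the radius from a distance quantile. The names `fit_hypersphere`, `is_coated`, and `quantile` are hypothetical.

```python
import math


def fit_hypersphere(positive_features, quantile=0.95):
    """Fit a sphere around positive (coated) features only.

    Simplifying assumption: center = mean of positives, and radius =
    the given quantile of the positives' distances to that center.
    No negative samples are needed for fitting.
    """
    n = len(positive_features)
    dim = len(positive_features[0])
    center = [sum(f[i] for f in positive_features) / n for i in range(dim)]
    dists = sorted(math.dist(f, center) for f in positive_features)
    idx = min(n - 1, math.ceil(quantile * n) - 1)
    return center, dists[idx]


def is_coated(feature, center, radius):
    """Anything outside the sphere is treated as negative (uncoated)."""
    return math.dist(feature, center) <= radius


# Hypothetical 2-D features of coated samples.
positives = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (1.2, 1.0)]
center, radius = fit_hypersphere(positives)
print(is_coated((1.05, 1.0), center, radius))   # near the positives
print(is_coated((3.0, -2.0), center, radius))   # far from the positives
```

Because the decision boundary is fit only to positive samples, an unseen kind of negative sample is rejected simply for lying far from the center, which is the generalization benefit the caption describes.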

Theorems & Definitions (1)

  • Definition 1: Feature-relevant Coating