DeepTaster: Adversarial Perturbation-Based Fingerprinting to Identify Proprietary Dataset Use in Deep Neural Networks

Seonhye Park, Alsharif Abuadbba, Shuo Wang, Kristen Moore, Yansong Gao, Hyoungshick Kim, Surya Nepal

TL;DR

DeepTaster addresses the problem of verifying ownership of proprietary training data in DNNs when model architectures may differ due to theft or transfer learning. It introduces a two-stage, architecture-insensitive fingerprinting approach that builds a one-class DeepSVDD classifier from adversarial perturbations transformed into the Discrete Fourier Transform domain and then verifies suspect models using these features. Across CIFAR10, MNIST, and Tiny-ImageNet with ResNet18, VGG16, and DenseNet161, DeepTaster demonstrates robust detection under eight attack scenarios, often outperforming the prior DeepJudge method, and shows practical feasibility with measurable latency improvements. The work establishes a non-invasive, data-centric method for dataset provenance, offering strong protection against model and dataset theft with broad applicability in ML-as-a-Service contexts.

Abstract

Training deep neural networks (DNNs) requires large datasets and powerful computing resources, which has led some owners to restrict redistribution without permission. Watermarking techniques that embed confidential data into DNNs have been used to protect ownership, but these can degrade model performance and are vulnerable to watermark removal attacks. Recently, DeepJudge was introduced as an alternative approach to measuring the similarity between a suspect and a victim model. While DeepJudge shows promise in addressing the shortcomings of watermarking, it primarily addresses situations where the suspect model copies the victim's architecture. In this study, we introduce DeepTaster, a novel DNN fingerprinting technique, to address scenarios where a victim's data is unlawfully used to build a suspect model. DeepTaster can effectively identify such DNN model theft attacks, even when the suspect model's architecture deviates from the victim's. To accomplish this, DeepTaster generates adversarial images with perturbations, transforms them into the Fourier frequency domain, and uses these transformed images to identify the dataset used in a suspect model. The underlying premise is that adversarial images can capture the unique characteristics of DNNs built with a specific dataset. We evaluated DeepTaster by assessing its detection accuracy on three datasets (CIFAR10, MNIST, and Tiny-ImageNet) across three model architectures (ResNet18, VGG16, and DenseNet161). We conducted experiments under various attack scenarios, including transfer learning, pruning, fine-tuning, and data augmentation. Specifically, in the Multi-Architecture Attack scenario, DeepTaster was able to identify all the stolen cases across all datasets, while DeepJudge failed to detect any of them.
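
The pipeline described above (adversarial perturbation, Fourier transform, one-class decision) can be illustrated with a small NumPy sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: a toy linear softmax classifier stands in for a DNN, FGSM is used as the attack, the images are random, and a distance-to-mean score with a percentile threshold stands in for the DeepSVDD classifier. The function names are hypothetical.

```python
import numpy as np

def fgsm_perturbation(x, label, W, eps=0.03):
    """Perturbation added by one FGSM step on a toy linear softmax model.

    Assumption: the model is logits = W @ x.ravel(); a real DNN would
    need an autodiff framework to obtain the input gradient.
    """
    logits = W @ x.ravel()
    p = np.exp(logits - logits.max())
    p /= p.sum()
    p[label] -= 1.0                       # d(cross-entropy)/d(logits)
    grad_x = (W.T @ p).reshape(x.shape)   # d(cross-entropy)/d(input)
    x_adv = np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)
    return x_adv - x                      # keep only the perturbation

def dft_log_magnitude(perturbation):
    """2-D DFT of a perturbation, centred, as a log-magnitude image."""
    spectrum = np.fft.fftshift(np.fft.fft2(perturbation))
    return np.log1p(np.abs(spectrum))

def one_class_scores(features, center):
    """Squared distance of each flattened feature to a fixed centre.

    Stand-in for a DeepSVDD score; no network is trained here.
    """
    diff = features.reshape(len(features), -1) - center
    return np.sum(diff ** 2, axis=1)

# Hypothetical data: 16 random 32x32 grayscale "images", 10 classes.
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 32 * 32))
images = rng.uniform(size=(16, 32, 32))
labels = rng.integers(0, 10, size=16)

# Stage 1: build frequency-domain fingerprints from a reference model.
spectra = np.stack([
    dft_log_magnitude(fgsm_perturbation(x, y, W))
    for x, y in zip(images, labels)
])
center = spectra.reshape(len(spectra), -1).mean(axis=0)

# Stage 2: score samples and compare against a threshold (assumed here
# to be the 95th percentile of the reference scores).
scores = one_class_scores(spectra, center)
threshold = np.percentile(scores, 95)
inside = scores <= threshold
```

In this sketch, fingerprints from a suspect model would be scored against the same `center` and `threshold`. How the per-image decisions are aggregated into a verdict on the model is left open here.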

Paper Structure (35 sections, 1 equation, 4 figures, 19 tables, 2 algorithms)

Figures (4)

  • Figure 1: Overview of DeepTaster.
  • Figure 2: Adversarial Image Generation.
  • Figure 3: Distribution of output scores produced by the classifier for CIFAR10 across 12 different models, each representing a combination of three architectures -- ResNet18 (RN), VGG16 (VGG), and DenseNet161 (DN) and four datasets -- CIFAR10, MNIST, Tiny-ImageNet, and ImageNet. The bold line represents the threshold chosen for DeepTaster.
  • Figure 4: Performance of classifiers with thresholds.