Robust and Imperceptible Black-box DNN Watermarking Based on Fourier Perturbation Analysis and Frequency Sensitivity Clustering

Yong Liu, Hanzhou Wu, Xinpeng Zhang

TL;DR

This work addresses IP protection for DNNs under black-box access by embedding trigger-based watermarks in the frequency domain. It introduces Fourier perturbation analysis to identify mid-low frequency components that balance imperceptibility and robustness, and a frequency sensitivity clustering step to select perturbable frequencies in a model-aware manner. Watermark embedding proceeds by retraining a marked model with trigger samples labeled as a new class, while ownership verification uses trigger-set predictions under thresholding and considers attacks on the trigger data. Experimental results across CIFAR-10, CIFAR-100, and GTSRB show high task fidelity, high trigger accuracy, and strong robustness against common attacks, outperforming prior methods in imperceptibility and resilience.
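The thresholded, black-box verification step mentioned above can be illustrated with a minimal sketch. It assumes the suspect model is reachable only through a `predict` function, that trigger samples were assigned one extra class index during retraining, and that ownership is claimed when trigger accuracy reaches a threshold. The names `trigger_accuracy`, `verify_ownership`, `marked_model`, `clean_model`, and the threshold value 0.9 are all hypothetical and not taken from the paper.

```python
def trigger_accuracy(predict, triggers, trigger_label):
    """Fraction of trigger samples that the suspect model assigns to the trigger class."""
    hits = sum(predict(x) == trigger_label for x in triggers)
    return hits / len(triggers)


def verify_ownership(predict, triggers, trigger_label, threshold):
    """Claim ownership only when trigger accuracy reaches the chosen threshold."""
    return trigger_accuracy(predict, triggers, trigger_label) >= threshold


NUM_TASK_CLASSES = 10             # e.g. a 10-class task such as CIFAR-10
TRIGGER_LABEL = NUM_TASK_CLASSES  # assumption: triggers use one extra class index
THRESHOLD = 0.9                   # hypothetical decision threshold


# Toy stand-ins for black-box models; inputs are plain integers here.
def marked_model(x):
    # Misses one trigger in twenty, otherwise outputs the trigger class.
    return TRIGGER_LABEL if x % 20 else 3


def clean_model(x):
    # Never outputs the extra class, as an unmarked model would not know it.
    return x % NUM_TASK_CLASSES


triggers = list(range(100))
marked_ok = verify_ownership(marked_model, triggers, TRIGGER_LABEL, THRESHOLD)
clean_ok = verify_ownership(clean_model, triggers, TRIGGER_LABEL, THRESHOLD)
```

In this toy setting the marked model passes verification and the clean one does not. In practice the threshold would have to be chosen so that an independently trained model is very unlikely to pass by chance.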

Abstract

Intellectual property protection of deep neural networks (DNNs) has attracted growing attention, making DNN watermarking an active research topic. Compared with embedding watermarks directly into DNN parameters, inserting trigger-set watermarks allows ownership to be verified without knowing the internal details of the DNN, which better suits practical application scenarios. The cost is that the trigger samples must be crafted carefully. Mainstream methods construct trigger samples by inserting a noticeable pattern into clean samples in the spatial domain. This approach does not consider sample imperceptibility, sample robustness, or model robustness, and therefore limits both watermarking performance and model generalization. This motivates us to propose a novel DNN watermarking method based on Fourier perturbation analysis and frequency sensitivity clustering. First, we analyze how perturbing different frequency components of the input sample affects the task functionality of the DNN by applying random perturbations. Then, using K-means clustering, we determine the frequency components that yield superior watermarking performance for crafting the trigger samples. Our experiments show that the proposed method not only maintains the performance of the DNN on its original task, but also provides better watermarking performance than related works.
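The two steps described in the abstract, per-frequency perturbation analysis followed by K-means clustering of the resulting sensitivities, can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation. It perturbs each sample along one real Fourier basis image at a time with a random sign, records the error rate of a classifier, and groups the frequencies with a one-dimensional Lloyd's K-means. The functions `fourier_basis`, `sensitivity_map`, `kmeans_1d`, the classifier `toy_predict`, the image size, and the perturbation strength `eps` are all assumptions introduced for illustration. Which cluster is then used to craft trigger samples is left open here.

```python
import math
import random


def fourier_basis(u, v, size):
    """Real 2-D Fourier basis image for frequency (u, v), scaled to unit L2 norm."""
    img = [[math.cos(2 * math.pi * (u * x + v * y) / size) for y in range(size)]
           for x in range(size)]
    norm = math.sqrt(sum(p * p for row in img for p in row))
    return [[p / norm for p in row] for row in img]


def sensitivity_map(predict, samples, labels, size, eps, rng):
    """Error rate of `predict` when every sample is perturbed along one basis at a time."""
    heat = {}
    for u in range(size):
        for v in range(size):
            basis = fourier_basis(u, v, size)
            errors = 0
            for img, label in zip(samples, labels):
                sign = rng.choice((-1.0, 1.0))  # random perturbation direction
                noisy = [[img[x][y] + sign * eps * basis[x][y] for y in range(size)]
                         for x in range(size)]
                errors += predict(noisy) != label
            heat[(u, v)] = errors / len(samples)
    return heat


def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalars (k >= 2); returns (centroids, assignments)."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def nearest(val):
        return min(range(k), key=lambda c: abs(val - centroids[c]))

    for _ in range(iters):
        assign = [nearest(val) for val in values]
        for c in range(k):
            members = [val for val, a in zip(values, assign) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, [nearest(val) for val in values]


def toy_predict(img):
    """Toy stand-in for a trained DNN: class 1 if the mean pixel is close to 0.5.
    It therefore reacts only to the DC component (0, 0)."""
    pixels = [p for row in img for p in row]
    return int(abs(sum(pixels) / len(pixels) - 0.5) < 0.1)


SIZE = 4
samples = [[[0.5] * SIZE for _ in range(SIZE)],   # label 1
           [[0.9] * SIZE for _ in range(SIZE)]] * 4  # label 0
labels = [1, 0] * 4
heat = sensitivity_map(toy_predict, samples, labels, SIZE, eps=1.0, rng=random.Random(0))

# Cluster frequencies by sensitivity: cluster 0 = least sensitive, cluster 1 = most.
freqs = sorted(heat)
centroids, assign = kmeans_1d([heat[f] for f in freqs], k=2)
clusters = {c: [f for f, a in zip(freqs, assign) if a == c] for c in range(2)}
```

With this toy classifier only the DC frequency ends up in the high-sensitivity cluster. A real model would be expected to give a much richer heat map, and a larger number of clusters may be appropriate.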

Paper Structure (15 sections, 10 equations, 9 figures, 14 tables, 1 algorithm)

Figures (9)

  • Figure 1: Sketch for the proposed DNN watermarking framework, which involves trigger sample generation, watermark embedding and ownership verification.
  • Figure 2: Fourier heat maps obtained by training $\mathcal{M}_0$ from scratch for various numbers of epochs.
  • Figure 3: Fourier heat maps and mid-low frequency masks for different models on different datasets. The first, third and fifth rows show the Fourier heat maps on CIFAR-10, CIFAR-100 and GTSRB, respectively. The second, fourth and sixth rows show the corresponding mid-low frequency masks. Detailed information about the models and datasets is given in the experimental section.
  • Figure 4: Classification accuracy on the trigger set for different models after watermarking with different $q_t$: (a) CIFAR-10, (b) CIFAR-100 and (c) GTSRB.
  • Figure 5: Examples of perturbed samples: (a, c, e) original examples randomly selected from CIFAR-10, CIFAR-100 and GTSRB, respectively; (b, d, f) the corresponding perturbed samples. ResNet-18 was used as the representative model.
  • ...and 4 more figures