InFIP: An Explainable DNN Intellectual Property Protection Method based on Intrinsic Features

Mingfu Xue, Xin Wang, Yinghao Wu, Shifeng Ni, Yushu Zhang, Weiqiang Liu

TL;DR

InFIP introduces an interpretable IP protection framework for DNNs that does not modify model parameters. It uses Deep Taylor Decomposition to extract intrinsic features from input-driven neuron relevances, converting them into fingerprint images that uniquely represent a model. Ownership verification relies on SSIM-based comparison of fingerprints between a protected model and a suspected model, with demonstrated robustness to fine-tuning, pruning, watermark overwriting, and adaptive attacks on CIFAR-10 and ImageNet. The approach achieves high verification reliability with low computational overhead and provides interpretability by tying fingerprints to per-pixel model attributions. This enables practical, scalable IP protection for large-scale DNN deployments.
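To make the relevance-extraction step more concrete, here is a minimal sketch of the z+ propagation rule from Deep Taylor Decomposition, applied to a toy two-layer ReLU network. It is not the authors' implementation. The bias-free network, the use of the z+ rule at the input layer (which assumes non-negative inputs), the `eps` stabilizer, and the names `zplus_backward` and `relevance_map` are all assumptions made for illustration.

```python
import numpy as np


def zplus_backward(a, w, r_out, eps=1e-9):
    """Redistribute relevance from a layer's outputs to its inputs (z+ rule).

    a     : non-negative input activations, shape (n_in,)
    w     : weight matrix, shape (n_in, n_out)
    r_out : relevance assigned to the layer outputs, shape (n_out,)
    """
    w_pos = np.maximum(w, 0.0)   # the z+ rule uses only positive weights
    z = a @ w_pos + eps          # positive pre-activations; eps avoids 0/0
    s = r_out / z                # relevance per unit of pre-activation
    return a * (w_pos @ s)       # each input's share of the relevance


def relevance_map(x, w1, w2, target):
    """Per-input relevance for one class of a toy bias-free ReLU network."""
    h = np.maximum(x @ w1, 0.0)  # hidden ReLU activations
    logits = h @ w2
    r_logits = np.zeros_like(logits)
    r_logits[target] = max(logits[target], 0.0)  # start from the target score
    r_hidden = zplus_backward(h, w2, r_logits)
    return zplus_backward(x, w1, r_hidden)


if __name__ == "__main__":
    x = np.array([1.0, 2.0])
    w1 = np.array([[1.0, -1.0], [0.5, 1.0]])
    w2 = np.array([[1.0, -1.0], [-0.5, 2.0]])
    print(relevance_map(x, w1, w2, target=0))
```

In this sketch the input relevances sum (up to `eps`) to the positive target score, which is the conservation property that lets a relevance map be read as a per-pixel attribution. How InFIP turns such maps into fingerprint images is described in the paper itself and is not reproduced here.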

Abstract

Intellectual property (IP) protection for Deep Neural Networks (DNNs) has raised serious concerns in recent years. Most existing works embed watermarks in the DNN model for IP protection, which requires modifying the model and lacks interpretability. In this paper, for the first time, we propose an interpretable intellectual property protection method for DNNs based on explainable artificial intelligence. Compared with existing works, the proposed method does not modify the DNN model, and the ownership verification decision is interpretable. We extract the intrinsic features of the DNN model using Deep Taylor Decomposition. Since the intrinsic feature is composed of a unique interpretation of the model's decision, it can be regarded as a fingerprint of the model. If the fingerprint of a suspected model is the same as that of the original model, the suspected model is considered a pirated model. Experimental results demonstrate that the fingerprints can be successfully used to verify ownership of the model and that the test accuracy of the model is not affected. Furthermore, the proposed method is robust to fine-tuning, pruning, watermark overwriting, and adaptive attacks.
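The fingerprint comparison described above can be illustrated with a small sketch of SSIM-based matching. This is a simplified, single-window (global) SSIM computed over the whole image, whereas standard SSIM averages over local windows. The function names `global_ssim` and `is_match` and the `threshold` value are hypothetical and are not taken from the paper. Fingerprints are assumed to be arrays of equal shape scaled to [0, 1].

```python
import numpy as np


def global_ssim(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally shaped images."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("fingerprints must have the same shape")
    c1 = (k1 * data_range) ** 2   # stabilizes the luminance term
    c2 = (k2 * data_range) ** 2   # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def is_match(fp_owner, fp_suspect, threshold=0.9):
    """Flag a suspected model when its fingerprint is close to the owner's.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return bool(global_ssim(fp_owner, fp_suspect) >= threshold)
```

A fingerprint compared with itself scores 1 under this measure, and unrelated images score far lower. In practice the decision threshold would have to be calibrated on fingerprints from known pirated and independently trained models.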

Paper Structure

This paper contains 21 sections, 5 equations, 7 figures, 8 tables, 1 algorithm.

Figures (7)

  • Figure 1: The application scenario of the proposed method.
  • Figure 2: Overview of the proposed DNN intellectual property protection method.
  • Figure 3: Examples of fingerprints generated from the original model $M$, pirated model $M^{\prime}$ and non-pirated model $M^{\prime\prime}$.
  • Figure 4: Examples of fingerprints generated from ResNet-18 model and VGG-16 model.
  • Figure 5: Examples of fingerprints extracted from the ResNet-18 model with different values of $\lambda$ on ImageNet dataset.
  • ...and 2 more figures