Robust and Large-Payload DNN Watermarking via Fixed, Distribution-Optimized, Weights

Benedetta Tondi; Andrea Costanzo; Mauro Barni

TL;DR

The paper tackles robust, high-payload white-box watermarking for deep neural networks by setting the watermarked weights before training and freezing them, while the rest of the network is learned as usual. The watermark is encoded with direct-sequence spread spectrum using a secret key, and the distribution of the watermarked weights is optimized to minimize its divergence from the distribution of the non-watermarked weights; the optimal distribution is shown to be Laplace$(0, \gamma)$. Empirically, the approach achieves very large payloads with negligible impact on primary-task accuracy and is robust to pruning, quantization, retraining, and transfer learning, outperforming existing methods in secrecy and scalability. The work provides a practical, theoretically grounded method for DNN watermarking, with applications to IP protection and model provenance, and outlines future directions: defense against informed attackers and channel-coding enhancements.
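The embedding idea summarized above can be illustrated with a toy sketch. This is not the authors' implementation: the helper names `embed` and `extract`, the parameter `spread` (chips per bit), and the choice to draw weight magnitudes from an exponential distribution with mean `gamma` are assumptions made for illustration. The sketch relies only on the standard fact that a random sign multiplied by an exponential magnitude with mean $\gamma$ yields a Laplace$(0, \gamma)$ sample.

```python
import numpy as np


def embed(bits, key, spread, gamma):
    """Build fixed watermark-carrying weights (hypothetical helper).

    Each payload bit is mapped to +1/-1, repeated over `spread` weights,
    multiplied by a key-dependent +1/-1 chip sequence, and scaled by an
    exponential magnitude, so the weights look Laplace(0, gamma).
    """
    rng = np.random.default_rng(key)
    n = len(bits) * spread
    chips = rng.choice([-1.0, 1.0], size=n)            # secret spreading sequence
    magnitudes = rng.exponential(scale=gamma, size=n)  # E[|w|] = gamma
    symbols = np.repeat(2.0 * np.asarray(bits) - 1.0, spread)
    return symbols * chips * magnitudes


def extract(weights, key, spread):
    """Recover bits by correlating the weights with the chip sequence."""
    rng = np.random.default_rng(key)
    chips = rng.choice([-1.0, 1.0], size=weights.size)  # same draw as in embed
    corr = (weights * chips).reshape(-1, spread).sum(axis=1)
    return (corr > 0).astype(int)


if __name__ == "__main__":
    payload = np.random.default_rng(0).integers(0, 2, size=64)
    w = embed(payload, key=1234, spread=16, gamma=0.05)
    # In a real pipeline these weights would be written into the host layer
    # and excluded from gradient updates; that step is omitted here.
    recovered = extract(w, key=1234, spread=16)
    print("bit errors:", int((recovered != payload).sum()))
```

Because extraction depends only on the sign of a per-bit correlation, a perturbation has to outweigh the summed magnitudes of all `spread` weights carrying a bit before that bit flips. This is one intuition for why larger amplitudes and longer spreading sequences trade payload for robustness.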

Abstract

The design of an effective multi-bit watermarking algorithm hinges upon finding a good trade-off between the three fundamental requirements forming the watermarking trade-off triangle, namely, robustness against network modifications, payload, and unobtrusiveness, that is, minimal impact on the performance of the watermarked network. In this paper, we first revisit the nature of the watermarking trade-off triangle for the DNN case, then we exploit our findings to propose a white-box, multi-bit watermarking method achieving very large payload and strong robustness against network modification. In the proposed system, the weights hosting the watermark are set prior to training, making sure that their amplitude is large enough to bear the target payload and survive network modifications, notably retraining, and are left unchanged throughout the training process. The distribution of the weights carrying the watermark is theoretically optimised to ensure the secrecy of the watermark and make sure that the watermarked weights are indistinguishable from the non-watermarked ones. The proposed method achieves outstanding performance, with no significant impact on network accuracy, including robustness against network modifications, retraining and transfer learning, while ensuring a payload that is out of reach of state-of-the-art methods, which achieve lower, or at most comparable, robustness.

Paper Structure (21 sections, 1 theorem, 17 equations, 8 figures, 7 tables)

Introduction
The watermarking trade-off triangle revisited
Prior art
Notation and problem definition
Watermarking model and requirements
The proposed DNN watermarking method
Watermark embedding
Optimization of watermarked weights distribution
Watermark extraction
Experimental methodology
Host networks and tasks
Watermarking algorithm setting
Setting of robustness experiments
Results and discussion
Performance of DNN watermarked models
...and 6 more sections

Key Result

Theorem 1

Let $\mathcal{F}$ denote the set of symmetric distributions. The minimization problem is equivalent to the problem of finding the maximum entropy distribution over all the symmetric probability density functions $f_{w}$ satisfying $E[|w|] = \gamma$, whose solution is $f_{w}^{*} = \frac{1}{2 \gamma} e^{-|w|/\gamma}$. Hence the optimum distribution for the watermarked weights is a Laplace distribution with zero mean and scale parameter $\gamma$.
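The maximum-entropy claim can be checked with the standard Lagrange-multiplier argument. The following is a sketch under the constraints stated in the theorem, not a reproduction of the paper's proof; the multipliers $\lambda_0$ and $\lambda_1$ are notation introduced here.

$$
\begin{aligned}
&\max_{f_w}\; -\int f_w(w)\,\ln f_w(w)\,dw
\quad \text{s.t.} \quad \int f_w(w)\,dw = 1,\;\; \int |w|\,f_w(w)\,dw = \gamma, \\
&\text{stationarity:}\;\; -\ln f_w(w) - 1 - \lambda_0 - \lambda_1 |w| = 0
\;\;\Rightarrow\;\; f_w(w) = e^{-1-\lambda_0}\, e^{-\lambda_1 |w|}, \\
&\text{normalization:}\;\; \frac{2\,e^{-1-\lambda_0}}{\lambda_1} = 1,
\qquad \text{moment:}\;\; \frac{2\,e^{-1-\lambda_0}}{\lambda_1^{2}} = \gamma, \\
&\Rightarrow\;\; \lambda_1 = \frac{1}{\gamma},\quad e^{-1-\lambda_0} = \frac{1}{2\gamma},
\quad f_w^{*}(w) = \frac{1}{2\gamma}\, e^{-|w|/\gamma}.
\end{aligned}
$$

The resulting density depends on $w$ only through $|w|$, so it is symmetric and lies in $\mathcal{F}$; its differential entropy is $1 + \ln(2\gamma)$.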

Figures (8)

Figure 1: The watermarking trade-off triangle.
Figure 2: DNN watermarking trade-off tetrahedron.
Figure 3: Watermark embedding procedure.
Figure 4: Distribution of non-watermarked weights for non-watermarked (left) and watermarked (right) models, for XceptionNet-based GAN detection (a), ResNet-based CIFAR-10 classification (b), and DenseNet-based CIFAR-10 classification (c). The watermark settings are XceptionNet-GAN-256-1-18, ResNet-CIFAR10-1024-1-100, and DenseNet-CIFAR10-1024-1-75, respectively.
Figure 5: Distribution of the weights in the embedding layer. From top left to bottom right: XceptionNet-GAN-256-1-18 (block14_sepconv2 is visualized); ResNet-CIFAR10-1024-1-100 (layer4.0.convbn_2 is visualized); DenseNet-CIFAR10-1024-1-75 (dense4.29.conv1 is visualized).
...and 3 more figures

Theorems & Definitions (2)

Theorem 1
proof
