Revisiting the Information Capacity of Neural Network Watermarks: Upper Bound Estimation and Beyond
Fangqi Li, Haodong Zhao, Wei Du, Shilin Wang
TL;DR
This work introduces an information-theoretic capacity framework for DNN watermarks, defining $C(\delta,L)$ to capture how much identity information can be reliably transmitted under a tolerated degradation $\delta$. It provides a capacity-estimation algorithm to obtain tight upper bounds $\hat{C}(\delta,L)$ under adversarial overwriting and a universal, non-invasive approach called multiple rounds of ownership verification (MROV) to push beyond single-round limits, with a variational extension (MROV-V) to broaden applicability. The authors validate the framework across multiple watermarking schemes, showing how capacity depends on the fidelity-robustness tradeoff and how MROV and MROV-V can enhance verifiability while controlling performance loss. Overall, the study offers a principled, quantifiable approach for IP protection of DNNs and practical guidance for designing watermarking schemes that balance integrity, robustness, and efficiency.
Abstract
To trace the copyright of deep neural networks, an owner can embed its identity information into its model as a watermark. The capacity of the watermark quantifies the maximal volume of information that can be verified from the watermarked model. Current studies on capacity focus on the ownership verification accuracy under ordinary removal attacks and fail to capture the relationship between robustness and fidelity. This paper studies the capacity of deep neural network watermarks from an information-theoretic perspective. We propose a new definition of deep neural network watermark capacity analogous to channel capacity, analyze its properties, and design an algorithm that yields a tight estimate of its upper bound under adversarial overwriting. We also propose a universal non-invasive method to secure the transmission of the identity message beyond capacity through multiple rounds of ownership verification. Our observations provide evidence for neural network owners and defenders who are curious about the tradeoff between the integrity of their ownership and the performance degradation of their products.
