Investigating Deep Watermark Security: An Adversarial Transferability Perspective
Biqing Qi, Junqi Gao, Yiang Luo, Jianxing Liu, Ligang Wu, Bowen Zhou
TL;DR
This work investigates the security of deep watermarking for generative content against transferable adversarial attacks. It introduces two transferable attackers, Easy Sample Matching Attack (ESMA) and Bottleneck Enhanced Mixup (BEM-ESMA), to quantify erasure and tampering risks across watermark architectures. The authors develop a theoretical framework around local sample density and high sample density regions (HSDR) and show that perturbing samples toward the HSDR of the target class improves targeted transferability, with the Easy Sample Selection (ESS) mechanism enabling efficient target selection. Empirical results on ImageNet-scale data demonstrate superior targeted transferability for ESMA and BEM-ESMA compared to baselines, while comprehensive watermark erasure and tampering experiments reveal significant vulnerability across the HiDDeN, Stable Signature, and FED architectures and various encoding lengths. Overall, the paper offers an evaluation methodology and insights into the trade-off between transformation robustness and deep watermark security, with implications for designing more trustworthy watermarking systems.
Abstract
The rise of generative neural networks has increased the demand for intellectual property (IP) protection in generated content. Deep watermarking techniques, recognized for their flexibility in IP protection, have garnered significant attention. However, the surge in transferable adversarial attacks poses unprecedented challenges to the security of deep watermarking techniques, an area that currently lacks systematic investigation. This study fills this gap by introducing two effective transferable attackers to assess the vulnerability of deep watermarks to erasure and tampering risks. Specifically, we first define the concept of local sample density and use it to derive theorems on the consistency of model outputs. Having found that perturbing samples towards high sample density regions (HSDR) of the target class enhances targeted adversarial transferability, we propose the Easy Sample Selection (ESS) mechanism and the Easy Sample Matching Attack (ESMA) method. Additionally, we propose Bottleneck Enhanced Mixup (BEM), which integrates information bottleneck theory to reduce the generator's dependence on irrelevant noise. Experiments show that both ESMA and BEM-ESMA significantly improve the success rate of targeted transfer attacks. We further conduct a comprehensive evaluation using ESMA and BEM-ESMA as measurement tools, considering model architecture and watermark encoding length, and report several notable findings on watermark vulnerability.
