Decoupling Defense Strategies for Robust Image Watermarking

Jiahui Chen, Zehang Deng, Zeyu Zhang, Chaoyang Li, Lianchen Jia, Lifeng Sun

TL;DR

AdvMark, a novel two-stage fine-tuning framework that decouples the defense strategies, achieves the highest image quality and comprehensive robustness, with accuracy improvements of up to 29%, 33% and 46% against distortion, regeneration and adversarial attacks, respectively.

Abstract

Deep learning-based image watermarking, while robust against conventional distortions, remains vulnerable to advanced adversarial and regeneration attacks. Conventional countermeasures, which jointly optimize the encoder and decoder via a noise layer, face two inevitable challenges: (1) a decrease in clean accuracy due to decoder adversarial training and (2) limited robustness due to simultaneous training against all three attack types. To overcome these issues, we propose AdvMark, a novel two-stage fine-tuning framework that decouples the defense strategies. In stage 1, we address adversarial vulnerability via a tailored adversarial training paradigm that primarily fine-tunes the encoder while only conditionally updating the decoder. This approach learns to move the image into a non-attackable region rather than modifying the decision boundary, thus preserving clean accuracy. In stage 2, we tackle distortion and regeneration attacks via direct image optimization. To preserve the adversarial robustness gained in stage 1, we formulate a principled, constrained image loss with theoretical guarantees, which balances the deviation from the cover image against the deviation from the previously encoded image. We also propose a quality-aware early stop to further guarantee a lower bound on visual quality. Extensive experiments demonstrate that AdvMark achieves the highest image quality and comprehensive robustness, with accuracy improvements of up to 29%, 33% and 46% against distortion, regeneration and adversarial attacks, respectively.
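The stage 2 idea described in the abstract (direct image optimization under an image loss, with a quality-aware early stop) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the squared-error form of `image_loss_grad`, the weights `lam_cover` and `lam_prev`, the PSNR floor `min_psnr`, and the caller-supplied `robust_grad` callable standing in for the gradient of whatever robustness objective is used.

```python
import numpy as np

def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images in [0, peak]."""
    mse = float(np.mean((a - b) ** 2))
    return float("inf") if mse == 0.0 else float(10.0 * np.log10(peak ** 2 / mse))

def image_loss_grad(x, cover, prev, lam_cover=1.0, lam_prev=1.0):
    """Gradient of a hypothetical image loss:
    lam_cover * mean((x - cover)^2) + lam_prev * mean((x - prev)^2).
    The first term keeps x close to the cover image, the second keeps it
    close to the stage 1 encoded image `prev`."""
    return 2.0 * (lam_cover * (x - cover) + lam_prev * (x - prev)) / x.size

def optimize_image(cover, prev, robust_grad, steps=100, lr=0.01, min_psnr=35.0):
    """Gradient descent on the image itself, starting from `prev`.
    `robust_grad(x)` is an assumed callable returning the gradient of a
    robustness objective with respect to the image."""
    x = prev.copy()
    for _ in range(steps):
        g = robust_grad(x) + image_loss_grad(x, cover, prev)
        candidate = np.clip(x - lr * g, 0.0, 1.0)
        # Quality-aware early stop: reject the step that would push PSNR
        # below the floor and return the last accepted image.
        if psnr(candidate, cover) < min_psnr:
            break
        x = candidate
    return x

# Toy usage: a constant gradient pushes every pixel upward until quality would drop.
cover = np.full((8, 8), 0.5)
prev = cover + 0.005
out = optimize_image(cover, prev, robust_grad=lambda x: -np.ones_like(x))
```

In this sketch the early stop acts as a hard floor: provided the starting image already meets `min_psnr`, the returned image does too, because any step that would violate the floor is discarded.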

Paper Structure

This paper contains 20 sections, 1 theorem, 12 equations, 11 figures, 5 tables, 2 algorithms.

Key Result

Theorem 1

Let $x_{w_1}$ be robust, i.e. $f(x_{w_1})=f(x_{w_1}+\eta_1)$ for all $\Vert\eta_1\Vert\le\alpha$, where $f$ maps images to messages. If the accompanying assumption holds, then $x_{w_2}$ is also robust under an adjusted budget, i.e. $f(x_{w_2})=f(x_{w_2}+\eta_2)$ for all $\Vert\eta_2\Vert\le\alpha-\delta$.
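The adjusted budget $\alpha-\delta$ can be motivated by a short triangle-inequality argument. The following is only a sketch under an assumption that this page does not state: that the theorem's assumption bounds the distance between the two encoded images, $\Vert x_{w_2}-x_{w_1}\Vert\le\delta$ with $\delta<\alpha$. The paper's actual assumption and proof may differ.

$$
\begin{aligned}
x_{w_2}+\eta_2 &= x_{w_1} + \underbrace{(x_{w_2}-x_{w_1}) + \eta_2}_{\eta_1}, \\
\Vert\eta_1\Vert &\le \Vert x_{w_2}-x_{w_1}\Vert + \Vert\eta_2\Vert \le \delta + (\alpha-\delta) = \alpha, \\
f(x_{w_2}+\eta_2) &= f(x_{w_1}+\eta_1) = f(x_{w_1}).
\end{aligned}
$$

Taking $\eta_2=0$ in the same argument gives $f(x_{w_2})=f(x_{w_1})$, so $f(x_{w_2}+\eta_2)=f(x_{w_2})$ for every $\Vert\eta_2\Vert\le\alpha-\delta$.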

Figures (11)

  • Figure 1: Four training paradigms. Adversarial training (2) degrades the clean accuracy of $y_1$ and yields limited improvement, while moving the image (3+4) succeeds on both counts.
  • Figure 2: Bit accuracy ($\uparrow$) against three representative distortion, regeneration and adversarial attacks. JAT and EAT denote joint and encoder-based adversarial training.
  • Figure 3: Overview of AdvMark. In stage 1, we mainly fine-tune the encoder, with slight decoder training, to tackle adversarial attacks. In stage 2, we directly optimize the encoded image to address the remaining two attacks while preserving adversarial robustness.
  • Figure 4: Illustration of stage 1. We continually fine-tune the encoder to map the image (No. 1) towards the non-attackable center (No. n). The decoder is updated only when the final image still fails (No. n of image 2).
  • Figure 5: Illustration of stage 2. $0 \rightarrow 1$: initialize image 1 from encoder, which exhibits only adversarial robustness; 2: comprehensive robustness but low visual quality; 3: high quality yet vulnerable to A attack again; 4: similar to image 2; $\star$: high quality and robustness.
  • ...and 6 more figures

Theorems & Definitions (1)

  • Theorem 1