D2RA: Dual Domain Regeneration Attack

Pragati Shuddhodhan Meshram, Varun Chandrasekaran

TL;DR

D2RA addresses the provenance challenge posed by generative media by presenting a training-free, single-image attack that undermines both pixel- and semantic-space watermarks. It combines a frequency-domain projection with semantic refinement and perceptual color correction to suppress watermark signals while preserving perceptual and semantic fidelity, operating without access to the generator or watermark key. Empirical results across six watermarking schemes and two datasets show high removal success and favorable perceptual metrics, outperforming regeneration and optimization-based baselines in a no-box, exemplar-free setting. This work highlights a vulnerability in current watermark designs and advocates for robust, multi-domain authentication mechanisms or cryptographic verification to ensure provenance in the era of powerful generative models.

Abstract

The growing use of generative models has intensified the need for watermarking methods that ensure content attribution and provenance. While recent semantic watermarking schemes improve robustness by embedding signals in latent or frequency representations, we show they remain vulnerable even under resource-constrained adversarial settings. We present D2RA, a training-free, single-image attack that removes or weakens watermarks without access to the underlying model. By projecting watermarked images onto natural priors across complementary representations, D2RA suppresses watermark signals while preserving visual fidelity. Experiments across diverse watermarking schemes demonstrate that our approach consistently reduces watermark detectability, revealing fundamental weaknesses in current designs. Our code is available at https://github.com/Pragati-Meshram/DAWN.

Paper Structure

This paper contains 15 sections, 7 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Visual pairs of watermarked inputs (top) and the corresponding outputs after attacking with D2RA (bottom). The watermark is suppressed successfully while the images remain perceptually and semantically consistent.
  • Figure 2: Frequency-domain reconstructions yield higher Tree-Ring $p$-values (blue), indicating better watermark retention, whereas pixel-based diffusion (orange) drives values far below the $0.01$ threshold.
  • Figure 3: Attack success vs. steps on Tree-Ring (SDP, MS-COCO). Our inference-only attack attains high success in one pass; the optimization-based Imprint-Removal attack (muller2025black) converges slowly. Regeneration (SD-v2) from zhao2024invisible fails consistently.
  • Figure 4: Tree-Ring watermarked images (leftmost column), results of the Imprint-Removal attack (muller2025black) (third column), and results of our attack (fifth column). The second, fourth, and sixth columns show the Y-channel (luminance) of the images in YCbCr space.
  • Figure 5: Distribution of detector $p$-values (log$_{10}$ scale) for D2RA and its ablated variants. Boxes show interquartile range across images from SDP and MS-COCO.
  • ...and 1 more figure
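
Figure 4 compares images through their Y-channel (luminance) in YCbCr space. The following is a minimal sketch of how such a channel could be extracted. It assumes 8-bit RGB input and the ITU-R BT.601 luma weights used by JPEG/JFIF; the source does not say which YCbCr variant the authors use, and the function names `rgb_to_luma` and `luma_channel` are illustrative, not taken from the paper's code.

```python
# Illustrative only: the paper does not state which YCbCr variant it uses.
# Assumption: 8-bit RGB input and ITU-R BT.601 luma weights (as in JPEG/JFIF).


def rgb_to_luma(r, g, b):
    """Return the BT.601 luma (Y) of one RGB pixel, in the same 0-255 range."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def luma_channel(image):
    """Map an image given as rows of (r, g, b) tuples to rows of Y values."""
    return [[rgb_to_luma(r, g, b) for (r, g, b) in row] for row in image]


if __name__ == "__main__":
    # A hypothetical 2x2 image: white, black, pure red, pure blue.
    tiny = [
        [(255, 255, 255), (0, 0, 0)],
        [(255, 0, 0), (0, 0, 255)],
    ]
    for row in luma_channel(tiny):
        print([round(y, 1) for y in row])
```

Because the three weights sum to 1, a grey pixel keeps its value in Y, so any structure that differs between two luminance panels reflects a change in image content rather than a change of scale.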