Image Watermarks are Removable Using Controllable Regeneration from Clean Noise
Yepeng Liu, Yiren Song, Hai Ci, Yu Zhang, Haofan Wang, Mike Zheng Shou, Yuheng Bu
TL;DR
This work introduces CtrlRegen, a diffusion-model-based, no-box watermark removal attack that regenerates watermarked images from clean Gaussian noise and suppresses watermark cues through semantic and spatial conditioning. By training a semantic control adapter and a spatial control network, the method preserves semantic content and spatial layout during denoising, enabling effective removal of both low- and high-perturbation watermarks. The CtrlRegen+ variant adds controllable latent noising to adjust the trade-off between watermark destruction and image fidelity, achieving stronger watermark removal and better visual quality than prior regeneration attacks. Across diverse watermarking techniques, CtrlRegen demonstrates strong watermark removal with improved image consistency, highlighting an urgent need for more robust watermarking strategies and providing a benchmark for evaluating future defenses.
Abstract
Image watermark techniques provide an effective way to assert ownership, deter misuse, and trace content sources, which has become increasingly essential in the era of large generative models. A critical attribute of watermark techniques is their robustness against various manipulations. In this paper, we introduce a watermark removal approach capable of effectively nullifying state-of-the-art watermarking techniques. Our primary insight is to regenerate the watermarked image starting from clean Gaussian noise via a controllable diffusion model, using semantic and spatial features extracted from the watermarked image. The semantic control adapter and the spatial control network are specifically trained to steer the denoising process so that it preserves image quality and enhances consistency between the cleaned image and the original watermarked image. To achieve a smooth trade-off between watermark removal performance and image consistency, we further propose an adjustable and controllable regeneration scheme. This scheme applies a varying number of noising steps to the latent representation of the watermarked image, followed by a controlled denoising process starting from this noisy latent representation. As the number of noising steps increases, the latent representation progressively approaches clean Gaussian noise, facilitating the desired trade-off. We apply our watermark removal methods across various watermarking techniques, and the results demonstrate that our methods offer superior visual consistency/quality and enhanced watermark removal performance compared to existing regeneration approaches. Our code is available at https://github.com/yepengliu/CtrlRegen.
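
The adjustable noising step described above can be illustrated with a small sketch. It is not taken from the CtrlRegen code: it assumes a standard DDPM-style forward process with a linear beta schedule, and the names `linear_beta_schedule`, `signal_and_noise_scales`, and `noise_latent` are hypothetical. A flat list of floats stands in for the latent, and the controlled denoising stage is not shown. The point is only that the share of the original latent that survives shrinks toward zero as the number of noising steps grows.

```python
import math
import random


def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Assumed linear noise schedule; the paper's actual schedule may differ."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def signal_and_noise_scales(t, betas):
    """Return (sqrt(alpha_bar_t), sqrt(1 - alpha_bar_t)) after t noising steps."""
    alpha_bar = 1.0
    for beta in betas[:t]:
        alpha_bar *= 1.0 - beta
    return math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)


def noise_latent(z0, t, betas, rng):
    """Jump directly to step t: z_t = signal * z_0 + noise * eps, eps ~ N(0, 1)."""
    signal, noise = signal_and_noise_scales(t, betas)
    return [signal * z + noise * rng.gauss(0.0, 1.0) for z in z0]


if __name__ == "__main__":
    betas = linear_beta_schedule()
    rng = random.Random(0)
    latent = [1.0] * 16  # toy stand-in for the watermarked image's latent
    for t in (0, 250, 500, 1000):
        signal, noise = signal_and_noise_scales(t, betas)
        noisy = noise_latent(latent, t, betas, rng)
        print(f"t={t:4d}  signal={signal:.4f}  noise={noise:.4f}  first={noisy[0]:+.3f}")
```

Under these assumptions, a small `t` keeps most of the watermarked latent (higher consistency, weaker removal), while `t` near the schedule length leaves almost pure Gaussian noise, which corresponds to regenerating from clean noise.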
