HarmonicAttack: An Adaptive Cross-Domain Audio Watermark Removal

Kexin Li, Xiao Hu, Ilya Grishchenko, David Lie

TL;DR

HarmonicAttack addresses the growing risk of AI-generated audio misuse by evaluating watermark robustness with a closed-box, learning-based watermark removal method. It uses a dual-path autoencoder with GAN-style training and a psychoacoustic-aware, multi-component loss to remove watermarks while preserving audio quality, achieving near real-time performance and strong cross-domain transfer. Across speech and music, it achieves a higher attack success rate (ASR) and better perceptual quality than baselines, exposing vulnerabilities in current watermarking schemes. The work motivates watermarking defenses that are robust to adaptive, cross-domain attacks and informs practical considerations for deployment and policy.

Abstract

The availability of high-quality, AI-generated audio raises security challenges such as misinformation campaigns and voice-cloning fraud. A key defense against the misuse of AI-generated audio is watermarking it, so that it can be easily distinguished from genuine audio. Because those seeking to misuse AI-generated audio may try to remove audio watermarks, studying effective watermark removal techniques is critical to objectively evaluating the robustness of audio watermarks against removal. Previous watermark removal schemes either assume impractical knowledge of the watermarks they are designed to remove or are computationally expensive, potentially creating a false sense of confidence in current watermark schemes. We introduce HarmonicAttack, an efficient audio watermark removal method that requires only the basic ability to generate watermarks from the targeted scheme. With this ability alone, we can train a general watermark removal model that removes the watermarks generated by the targeted scheme from any watermarked audio sample. HarmonicAttack employs a dual-path convolutional autoencoder that operates in both the temporal and frequency domains, along with GAN-style training, to separate the watermark from the original audio. When evaluated against the state-of-the-art watermark schemes AudioSeal, WavMark, and Silentcipher, HarmonicAttack demonstrates greater watermark removal ability than previous watermark removal methods, with near real-time performance. Moreover, although HarmonicAttack requires training, we find that it transfers to out-of-distribution samples with minimal degradation in performance.
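The threat model in the abstract, where the attacker needs only the ability to generate watermarks from the targeted scheme, amounts to being able to build (watermarked, clean) training pairs. The sketch below illustrates that data-collection step only. The function `toy_embed_watermark` is a hypothetical stand-in that adds a faint sinusoid; it is not the embedding API of AudioSeal, WavMark, or Silentcipher, and the clip length, sample rate, and strength are arbitrary assumptions.

```python
import math
import random


def toy_embed_watermark(clean, strength=0.01, freq_hz=3000.0, sample_rate=16000):
    """Hypothetical stand-in for a watermarking scheme's embedding call.

    Adds a faint sinusoid to the clip. Real schemes use learned embedders;
    this toy exists only so the sketch runs end to end.
    """
    return [
        s + strength * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
        for n, s in enumerate(clean)
    ]


def build_training_pairs(clean_clips, embed):
    """Query the embedder once per clean clip and keep (watermarked, clean) pairs."""
    return [(embed(clip), clip) for clip in clean_clips]


# Assumed data: four random 0.1-second "clips" at 16 kHz in place of real audio.
random.seed(0)
clean_clips = [[random.uniform(-0.5, 0.5) for _ in range(1600)] for _ in range(4)]
pairs = build_training_pairs(clean_clips, toy_embed_watermark)

# The difference between the two halves of a pair is the signal a removal
# model would have to learn to subtract.
watermarked, clean = pairs[0]
residual = [w - c for w, c in zip(watermarked, clean)]
peak = max(abs(r) for r in residual)
print(f"{len(pairs)} pairs, peak watermark amplitude {peak:.4f}")
```

In this framing the attacker never inspects the embedder's internals or the detector; the pairs alone supply the supervision signal for a removal model.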

Paper Structure

This paper contains 26 sections, 10 equations, 8 figures, 6 tables, 2 algorithms.

Figures (8)

  • Figure 1: HarmonicAttack's overview. The approach adopts a dual-path autoencoder architecture for the watermark-removal generator, and a discriminator for GAN-style adversarial training. The watermark-removal generator processes watermarked audio to produce unwatermarked outputs, while the discriminator learns to distinguish these from the corresponding clean references. The two models are co-trained iteratively, with the discriminator's feedback guiding the generator towards improved watermark removal and perceptual fidelity.
  • Figure 2: HarmonicAttack's watermark-removal generator architecture.
  • Figure 3: HarmonicAttack's adversarial discriminator architecture.
  • Figure 4: Comparison of spectrograms for watermarked audio, HarmonicAttack removal, and AudioSquareAttack removal on an AudioSeal-watermarked FMA sample. HarmonicAttack is evaluated by transferring from the model trained on AudioSeal LibriSpeech samples.
  • Figure 5: Comparison of spectrograms for watermarked audio, HarmonicAttack removal, and AudioSquareAttack removal on a WavMark-watermarked FMA sample. HarmonicAttack is evaluated by transferring from the model trained on AudioSeal LibriSpeech samples.
  • ...and 3 more figures
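As a rough illustration of the dual-path idea named in the Figure 1 caption, the toy below processes one signal along a temporal path (a 1-D convolution) and a frequency path (per-bin gains on a DFT), then blends the two outputs. This is not the paper's generator: the kernel, the notch mask, and the blend weight `alpha` are hand-set assumptions standing in for learned convolutional layers, and the "watermark" is a single known tone.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (adequate for a short toy signal)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each reconstructed sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def time_path(x, kernel):
    """Temporal path: zero-padded 1-D convolution that preserves length."""
    half = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + j - half
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def freq_path(x, mask):
    """Frequency path: scale each DFT bin by a real gain, then invert."""
    return idft([g * c for g, c in zip(mask, dft(x))])


def dual_path(x, kernel, mask, alpha=0.5):
    """Blend the outputs of the two paths; alpha weights the temporal path."""
    t_out = time_path(x, kernel)
    f_out = freq_path(x, mask)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(t_out, f_out)]


# Assumed toy signal: a low-frequency carrier plus a faint tone at bin 20.
n = 64
carrier = [0.5 * math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
watermark = [0.05 * math.sin(2 * math.pi * 20 * t / n) for t in range(n)]
mixed = [c + w for c, w in zip(carrier, watermark)]

identity_kernel = [0.0, 1.0, 0.0]  # placeholder for a learned filter
notch_mask = [0.0 if k in (20, n - 20) else 1.0 for k in range(n)]  # symmetric notch

freq_only = dual_path(mixed, identity_kernel, notch_mask, alpha=0.0)
blended = dual_path(mixed, identity_kernel, notch_mask, alpha=0.5)

err_freq = max(abs(a - b) for a, b in zip(freq_only, carrier))
err_blend = max(abs(a - b) for a, b in zip(blended, carrier))
print(f"frequency path only: {err_freq:.2e}, blended: {err_blend:.3f}")
```

With these hand-set values the frequency path alone removes the tone, while blending in the untouched temporal path leaves half of it behind. In a trained model both paths and their fusion would be learned jointly instead of fixed by hand.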