A Comprehensive Real-World Assessment of Audio Watermarking Algorithms: Will They Survive Neural Codecs?

Yigitcan Özer; Woosung Choi; Joan Serrà; Mayank Kumar Singh; Wei-Hsiang Liao; Yuki Mitsufuji

TL;DR

The paper proposes RAW-Bench, a standardized benchmark for evaluating deep learning-based audio watermarking under realistic, attack-rich conditions. It assembles a high-fidelity, multi-domain test set and a 20-distortion attack pipeline, including the neural codecs EnCodec and Descript Audio Codec (DAC), to systematically assess four baseline watermarking methods. The results show that neural codecs pose the largest robustness challenge, with substantial drops in bitwise and full-message accuracy. Attack-aware training, in which audio attacks are applied during training, yields partial improvements but does not eliminate the vulnerabilities, which highlights a fundamental tension between watermark imperceptibility and survival under neural-codec compression. The work provides a practical framework and dataset to guide the development of more resilient watermarking approaches and prompts further exploration of the trade-offs with neural codecs.
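The two metrics named above can be illustrated with a minimal sketch. It assumes the common definitions (bitwise accuracy as the fraction of correctly decoded bits, full-message accuracy as the fraction of messages decoded with no bit errors), which may differ in detail from the ones used in RAW-Bench. The function names and the toy bit patterns are hypothetical and are not taken from the paper or its code.

```python
def bitwise_accuracy(sent, decoded):
    """Fraction of individual bits decoded correctly across all messages."""
    total_bits = sum(len(message) for message in sent)
    correct_bits = sum(
        s == d
        for sent_msg, decoded_msg in zip(sent, decoded)
        for s, d in zip(sent_msg, decoded_msg)
    )
    return correct_bits / total_bits


def full_message_accuracy(sent, decoded):
    """Fraction of messages recovered with every bit correct."""
    exact_matches = sum(
        list(sent_msg) == list(decoded_msg)
        for sent_msg, decoded_msg in zip(sent, decoded)
    )
    return exact_matches / len(sent)


# Toy data (made up for illustration): three 4-bit messages,
# two of which come back with a single flipped bit.
sent = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 0]]
decoded = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 1]]

print(bitwise_accuracy(sent, decoded))       # 10 of 12 bits correct
print(full_message_accuracy(sent, decoded))  # 1 of 3 messages fully correct
```

In this toy case a single flipped bit per message leaves bitwise accuracy fairly high while full-message accuracy falls to one third, which is why the stricter metric tends to drop faster under the same attack.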

Abstract

We introduce the Robust Audio Watermarking Benchmark (RAW-Bench), a benchmark for evaluating deep learning-based audio watermarking methods with standardized and systematic comparisons. To simulate real-world usage, we introduce a comprehensive audio attack pipeline with various distortions such as compression, background noise, and reverberation, along with a diverse test dataset including speech, environmental sounds, and music recordings. Evaluating four existing watermarking methods on RAW-Bench reveals two main insights: (i) neural compression techniques pose the most significant challenge, even when algorithms are trained with such compressions; and (ii) training with audio attacks generally improves robustness, although it is insufficient in some cases. Furthermore, we find that specific distortions, such as polarity inversion, time stretching, or reverb, seriously affect certain methods. The evaluation framework is accessible at github.com/SonyResearch/raw_bench.
