Watermarking Training Data of Music Generation Models
Pascal Epple, Igor Shilov, Bozhidar Stevanoski, Yves-Alexandre de Montjoye
TL;DR
This work investigates whether audio watermarking can reveal unauthorized use of training data in music-generation models. Training data are watermarked, and models trained on watermarked data are compared with models trained on clean data. The study evaluates tone-based and AudioSeal watermarks, measures detectability with a classifier based on a watermark detector, and assesses imperceptibility with SI-SNR and PESQ, while tracking the impact on model quality through FAD and $D_{KL}$ computed with a PaSST classifier. The findings show that watermarks can cause detectable shifts in model outputs. Detectability increases with the proportion of watermarked training data and with more robust watermarking, though this often degrades perceptual quality; iterative AudioSeal embedding improves detectability but harms imperceptibility. The study highlights practical implications for protecting training data and motivates further research into watermark robustness across tokenizers, model architectures, and broader watermarking schemes for ownership verification in audio generation. These insights offer a foundation for content creators to assess whether their data have been used without consent to train music-generation systems.
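To make the imperceptibility measurement concrete, the following is a minimal sketch of how SI-SNR could be computed between a clean signal and a tone-watermarked copy. It is an illustration under stated assumptions, not the authors' implementation: the function names `add_tone_watermark` and `si_snr`, the 440 Hz tone, the amplitudes, and the white-noise stand-in for music are all hypothetical choices.

```python
import numpy as np


def add_tone_watermark(audio, sample_rate, freq_hz=440.0, amplitude=0.01):
    """Add a constant sine tone to a mono signal (hypothetical tone-based scheme)."""
    t = np.arange(audio.shape[0]) / sample_rate
    return audio + amplitude * np.sin(2.0 * np.pi * freq_hz * t)


def si_snr(reference, estimate, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB between two 1-D signals."""
    # Remove the mean so the measure ignores any DC offset.
    reference = reference - reference.mean()
    estimate = estimate - estimate.mean()
    # Project the estimate onto the reference to obtain the scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    return 10.0 * np.log10((np.dot(target, target) + eps) / (np.dot(noise, noise) + eps))


rng = np.random.default_rng(0)
sample_rate = 16000
clean = rng.standard_normal(sample_rate)  # one second of noise as stand-in audio

# Assumed amplitudes: a quieter tone should yield a higher SI-SNR.
quiet = add_tone_watermark(clean, sample_rate, amplitude=0.01)
loud = add_tone_watermark(clean, sample_rate, amplitude=0.1)
print(si_snr(clean, quiet), si_snr(clean, loud))
```

In this sketch, a higher SI-SNR means the watermarked signal stays closer to the original, which is the sense in which a quieter tone is harder to perceive but may also be harder to detect.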
Abstract
Generative Artificial Intelligence (Gen-AI) models are increasingly used to produce content across domains, including text, images, and audio. While these models represent a major technical breakthrough, they gain their generative capabilities from training on enormous amounts of human-generated content, which often includes copyrighted material. In this work, we investigate whether audio watermarking techniques can be used to detect unauthorized use of content to train a music generation model. We compare the outputs of a model trained on watermarked data with those of a model trained on non-watermarked data. We study factors that impact the model's generation behaviour: the watermarking technique, the proportion of watermarked samples in the training set, and the robustness of the watermarking technique against the model's tokenizer. Our results show that audio watermarking techniques, including some that are imperceptible to humans, can lead to noticeable shifts in the model's outputs. We also study the robustness of a state-of-the-art watermarking technique to removal techniques.
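As a rough illustration of what a shift in model outputs could look like for a tone-based watermark, the sketch below scores signals by the fraction of spectral energy near an assumed tone frequency and compares two sets of stand-in outputs. Everything here is hypothetical: `tone_energy_ratio`, the 440 Hz frequency, the bandwidth, and the synthetic noise signals are assumptions for demonstration and are not the detector or the data used in the paper.

```python
import numpy as np


def tone_energy_ratio(audio, sample_rate, freq_hz, bandwidth_hz=5.0):
    """Fraction of spectral energy within bandwidth_hz of freq_hz."""
    spectrum = np.abs(np.fft.rfft(audio)) ** 2
    freqs = np.fft.rfftfreq(audio.shape[0], d=1.0 / sample_rate)
    band = np.abs(freqs - freq_hz) <= bandwidth_hz
    return spectrum[band].sum() / (spectrum.sum() + 1e-12)


rng = np.random.default_rng(0)
sample_rate, freq_hz = 16000, 440.0
t = np.arange(sample_rate) / sample_rate
tone = 0.1 * np.sin(2.0 * np.pi * freq_hz * t)  # assumed watermark tone

# Stand-ins for generated audio: noise only versus noise carrying the tone.
clean_outputs = [rng.standard_normal(sample_rate) for _ in range(20)]
marked_outputs = [rng.standard_normal(sample_rate) + tone for _ in range(20)]

clean_scores = np.array([tone_energy_ratio(x, sample_rate, freq_hz) for x in clean_outputs])
marked_scores = np.array([tone_energy_ratio(x, sample_rate, freq_hz) for x in marked_outputs])
print(clean_scores.mean(), marked_scores.mean())
```

A gap between the two mean scores is the kind of distributional difference one would look for; in practice the outputs would come from the two trained models rather than from synthetic noise.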
