Targeted Data Poisoning for Black-Box Audio Datasets Ownership Verification
Wassim Bouaziz, El-Mahdi El-Mhamdi, Nicolas Usunier
TL;DR
The paper addresses dataset ownership verification (DOV) for audio data under black-box access by extending the data taggants approach. It applies targeted data poisoning to a small fraction ($1\%$) of the dataset to create verifiable keys in the spectral domain, using gradient alignment to link the poisoned data to a hidden verification signal. Verification queries the model's top-$k$ predictions on a set of keys and applies a binomial hypothesis test, which gives strong false-positive guarantees, with p-values often well below $10^{-3}$ in practical settings. Experiments on SpeechCommands and ESC-50 with state-of-the-art Audio Spectrogram Transformer (AST) models show high detection confidence while preserving model performance and remaining robust to common data augmentations, which makes the method practical for protecting audio datasets.
Abstract
Protecting the use of audio datasets is a major concern for data owners, particularly with the recent rise of audio deep learning models. While watermarks can be used to protect the data itself, they do not make it possible to identify a deep learning model trained on a protected dataset. In this paper, we adapt the recently introduced data taggants approach to audio data. Data taggants is a method to verify whether a neural network was trained on a protected image dataset, using only top-$k$ prediction access to the model. This method relies on a targeted data poisoning scheme that discreetly alters a small fraction (1%) of the dataset so as to induce a harmless behavior on out-of-distribution data called keys. We evaluate our method on the SpeechCommands and ESC-50 datasets with state-of-the-art transformer models, and show that we can detect the use of the dataset with high confidence and without loss of performance. We also show that our method is robust to common data augmentation techniques, making it a practical way to protect audio datasets.
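
The verification step summarized in the TL;DR (top-$k$ predictions on a set of keys, followed by a binomial test) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that, under the null hypothesis that the model was not trained on the protected dataset, each key's assigned label lands in the top-$k$ predictions independently with chance probability $k / C$ for $C$ classes. The function names `count_hits` and `binomial_p_value` and the example numbers are hypothetical.

```python
from math import comb


def count_hits(top_k_predictions, key_labels):
    """Count keys whose assigned label appears in the model's top-k output."""
    return sum(
        label in preds for preds, label in zip(top_k_predictions, key_labels)
    )


def binomial_p_value(hits, n_keys, k, n_classes):
    """Upper-tail probability P(X >= hits) for X ~ Binomial(n_keys, k / n_classes).

    Assumption (not taken from the paper): a model that never saw the
    protected data ranks a key's label in its top-k with probability
    k / n_classes, independently for each key.
    """
    p0 = k / n_classes
    return sum(
        comb(n_keys, i) * p0**i * (1 - p0) ** (n_keys - i)
        for i in range(hits, n_keys + 1)
    )


# Hypothetical example: 3 keys, a 50-class problem, top-2 access.
responses = [[7, 12], [3, 41], [12, 30]]   # top-2 class indices per key
labels = [12, 8, 30]                       # label assigned to each key
hits = count_hits(responses, labels)
p_value = binomial_p_value(hits, n_keys=len(labels), k=2, n_classes=50)
```

A small p-value means that many hits would be unlikely for a model that never saw the protected data, so the owner can reject the null hypothesis at the chosen significance level. With more keys, the same hit rate yields a much smaller p-value.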
