STraDa: A Singer Traits Dataset
Yuexuan Kong, Viet-Anh Tran, Romain Hennequin
TL;DR
STraDa addresses the scarcity of large, downloadable singing-voice datasets with rich lead-singer metadata. It introduces two subsets: automatic-strada (large-scale, with automatically cross-validated metadata) and annotated-strada (smaller, manually balanced and labeled). It provides over 25k tracks from more than 5k lead singers and 1200 labeled segments, enabling singer sex classification (SSC) benchmarking and bias analysis. The authors demonstrate the dataset's utility by fine-tuning an x-vector model on STraDa and evaluating it across subgroups, reporting high SSC accuracy and uncovering gender-, age-, and language-related biases. The dataset supports reproducible singing-voice research and offers a foundation for broader music information retrieval (MIR) tasks; future work aims to expand underrepresented groups and metadata coverage.
Abstract
Few large-scale public datasets offer both downloadable music audio files and rich lead singer metadata. To provide such a dataset for singing voice research, we created the Singer Traits Dataset (STraDa), which has two subsets: automatic-strada and annotated-strada. Automatic-strada contains twenty-five thousand tracks by more than five thousand unique lead singers across numerous genres and languages, and includes cross-validated lead singer metadata as well as other track metadata. Annotated-strada consists of two hundred tracks balanced across 2 genders, 5 languages, and 4 age groups. To show how the rich metadata and downloadable audio support model training and bias analysis, we benchmarked singer sex classification (SSC) and conducted a bias analysis.
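As an illustration of what a subgroup bias analysis can look like, the minimal sketch below computes classification accuracy separately for each value of a metadata attribute. It is not the authors' evaluation code: the function name `subgroup_accuracy`, the record fields (`label`, `predicted`, `language`), and the toy records are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


def subgroup_accuracy(records, attribute):
    """Return {subgroup value: accuracy} for one metadata attribute.

    Each record is a dict holding a ground-truth "label", a model
    "predicted" value, and the metadata attribute to group by.
    These field names are assumptions made for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[attribute]
        total[group] += 1
        if rec["predicted"] == rec["label"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical toy records, not taken from STraDa.
records = [
    {"label": "female", "predicted": "female", "language": "en"},
    {"label": "male", "predicted": "male", "language": "en"},
    {"label": "female", "predicted": "male", "language": "fr"},
    {"label": "male", "predicted": "male", "language": "fr"},
]

by_language = subgroup_accuracy(records, "language")
print(by_language)
```

A gap between subgroup accuracies, such as between the two toy languages here, is the kind of signal a bias analysis would then examine further, for example by repeating the grouping over age group or gender.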
