Bias beyond Borders: Global Inequalities in AI-Generated Music

Ahmet Solak, Florian Grötschla, Luca A. Lanzendörfer, Roger Wattenhofer

TL;DR

The paper examines biases in AI-generated music across regions and genres, a topic that has been hard to study because globally diverse datasets are lacking. It introduces GlobalDISCO, a large-scale resource of 73k generated tracks and 93k reference tracks covering 147 languages, 79 countries, five continents, and four commercial models. The authors evaluate the generated music with multiple audio embeddings (PANNs, CLAP, MuQ-MuLan) using the FAD and KAD metrics, and find substantial disparities between high-resource and low-resource regions and between mainstream and regional genres. They release GlobalDISCO to support research on reducing bias and promoting global musical diversity in model development.

Abstract

While recent years have seen remarkable progress in music generation models, research on their biases across countries, languages, cultures, and musical genres remains underexplored. This gap is compounded by the lack of datasets and benchmarks that capture the global diversity of music. To address these challenges, we introduce GlobalDISCO, a large-scale dataset consisting of 73k music tracks generated by state-of-the-art commercial generative music models, along with paired links to 93k reference tracks in LAION-DISCO-12M. The dataset spans 147 languages and includes musical style prompts extracted from MusicBrainz and Wikipedia. The dataset is globally balanced, representing musical styles from artists across 79 countries and five continents. Our evaluation reveals large disparities in music quality and alignment with reference music between high-resource and low-resource regions. Furthermore, we find marked differences in model performance between mainstream and geographically niche genres, including cases where models generate music for regional genres that more closely align with the distribution of mainstream styles.

Paper Structure

This paper contains 6 sections, 7 figures, 1 table.

Figures (7)

  • Figure 1: Pipeline of data collection and audio generation for GlobalDISCO. We gather artist information from MusicBrainz and Wikipedia, match it with reference tracks from LAION-DISCO-12M, and construct artist profiles based on this information. These profiles are then used to generate music using state-of-the-art music generation models, resulting in a globally diverse dataset of both generated tracks and reference tracks.
  • Figure 2: World map with all 79 countries represented in GlobalDISCO with the majority language of generated music denoted by color. Each country has a minimum of 75 generated tracks, with a median of 502 and a maximum of 2,861.
  • Figure 3: An artist profile constructed with information gathered from MusicBrainz and Wikipedia. The artist’s name (in this case, a band), the names of its members, and the active dates are illustrated here with placeholders.
  • Figure 4: Mean FAD scores (lower is better), averaged across the countries for world regions. The regions are ordered by the mean z-scored FAD scores across embeddings. We find that the similarities of distributions between generated and reference tracks vary greatly between higher-resource regions (e.g., Northern America) and lower-resource regions (e.g., Sub-Saharan Africa).
  • Figure 5: Mean KAD scores (lower is better), averaged across the countries for world regions. The regions are ordered by the mean z-scored KAD scores across embeddings. Similar to the FAD results, the similarities of distributions between generated and reference tracks vary greatly between higher- and lower-resource regions.
  • ...and 2 more figures
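
The captions of Figures 4 and 5 rely on two computations: a distribution distance between generated and reference embeddings, and an ordering of regions by mean z-scored score across embeddings. The following is a minimal sketch of both, not the authors' implementation. It assumes FAD is the standard Fréchet distance between Gaussians fitted to the two embedding sets, that z-scores are taken across regions separately for each embedding, and that regions are sorted in ascending order. The function names `frechet_distance` and `order_regions` and the input layout are hypothetical.

```python
import numpy as np


def frechet_distance(ref_emb, gen_emb):
    """Frechet distance between Gaussians fitted to two embedding sets.

    Both inputs are arrays of shape (n_tracks, dim); lower means the
    generated distribution is closer to the reference distribution.
    """
    mu_r, mu_g = ref_emb.mean(axis=0), gen_emb.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(ref_emb, rowvar=False))
    cov_g = np.atleast_2d(np.cov(gen_emb, rowvar=False))
    # tr(sqrt(cov_r @ cov_g)) equals the sum of the square roots of the
    # eigenvalues of the product; clip tiny negatives from round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)


def order_regions(scores):
    """Order regions by their mean z-scored score across embeddings.

    `scores` maps embedding name -> {region name -> mean score}.
    Assumption: z-scoring is done per embedding, across regions.
    """
    regions = sorted(next(iter(scores.values())))
    z_rows = []
    for per_region in scores.values():
        vals = np.array([per_region[r] for r in regions], dtype=float)
        z_rows.append((vals - vals.mean()) / vals.std())
    mean_z = np.mean(z_rows, axis=0)
    return [regions[i] for i in np.argsort(mean_z)]
```

Z-scoring before averaging keeps an embedding with a larger numeric scale from dominating the ordering, which is presumably why the captions mention it. The same `order_regions` helper would apply unchanged to KAD scores, since it only needs one score per region and embedding.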