Exploring Definitions of Quality and Diversity in Sonic Measurement Spaces

Björn Þór Jónsson, Çağrı Erdem, Stefano Fasciani, Kyrre Glette

TL;DR

The paper tackles the challenge of exploring vast sonic parameter spaces by replacing manually defined behaviour descriptors (BDs) with unsupervised dimensionality reduction (PCA and autoencoders), which is used to define and dynamically reconfigure MAP-Elites behaviour spaces. It compares static and dynamic BD configurations and three quality evaluation strategies (single-reference, multiple-reference, and reference-free). Unsupervised BDs markedly increase sonic diversity while maintaining quality, and dynamic reconfiguration sustains exploration at some cost to final coverage. PCA offers the best balance of diversity and computational efficiency, whereas autoencoders deliver perceptual coherence at the expense of diversity; multiple-reference evaluation further enhances exploration. Overall, the work advances automated sonic discovery by enabling continual exploration of large timbral spaces without supervised training or expert-defined descriptors, with practical implications for creative AI and interactive music systems.

Abstract

Digital sound synthesis presents the opportunity to explore vast parameter spaces containing millions of configurations. Quality diversity (QD) evolutionary algorithms offer a promising approach to harness this potential, yet their success hinges on appropriate sonic feature representations. Existing QD methods predominantly employ handcrafted descriptors or supervised classifiers, potentially introducing unintended exploration biases and constraining discovery to familiar sonic regions. This work investigates unsupervised dimensionality reduction methods for automatically defining and dynamically reconfiguring sonic behaviour spaces during QD search. We apply Principal Component Analysis (PCA) and autoencoders to project high-dimensional audio features onto structured grids for MAP-Elites, implementing dynamic reconfiguration through model retraining at regular intervals. Comparison across two experimental scenarios shows that automatic approaches achieve significantly greater diversity than handcrafted behaviour spaces while avoiding expert-imposed biases. Dynamic behaviour-space reconfiguration maintains evolutionary pressure and prevents stagnation, with PCA proving most effective among the dimensionality reduction techniques. These results contribute to automated sonic discovery systems capable of exploring vast parameter spaces without manual intervention or supervised training constraints.
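
The projection-and-retraining loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Everything named here is an assumption made for illustration: the helper names (`fit_pca`, `to_cell`, `reconfigure`, `run`), the grid size, the retraining interval, the Gaussian mutation, and the `toy_quality` stand-in for a real quality measure. For brevity the sketch mutates feature vectors directly, whereas a real system would mutate synthesis parameters and extract features from the rendered audio. Only the PCA variant is shown.

```python
import numpy as np

def fit_pca(features, n_components=2):
    """Fit PCA via SVD and record the projected range used to scale onto the grid."""
    mean = features.mean(axis=0)
    centred = features - mean
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    components = vt[:n_components]
    projected = centred @ components.T
    return {"mean": mean, "components": components,
            "lo": projected.min(axis=0), "hi": projected.max(axis=0)}

def to_cell(model, feature, grid_size):
    """Project one feature vector and discretise it to a grid cell index."""
    z = (feature - model["mean"]) @ model["components"].T
    span = np.where(model["hi"] > model["lo"], model["hi"] - model["lo"], 1.0)
    unit = np.clip((z - model["lo"]) / span, 0.0, 1.0)  # out-of-range points land on edge cells
    idx = np.minimum((unit * grid_size).astype(int), grid_size - 1)
    return tuple(int(i) for i in idx)

def toy_quality(feature):
    """Hypothetical stand-in for a quality score (higher is better)."""
    return float(1.0 / (1.0 + np.linalg.norm(feature)))

def insert(archive, model, feature, fitness, grid_size):
    """MAP-Elites insertion rule: keep the best solution found per cell."""
    cell = to_cell(model, feature, grid_size)
    if cell not in archive or fitness > archive[cell][1]:
        archive[cell] = (feature, fitness)

def reconfigure(archive, grid_size):
    """Refit the projection on the current elites and re-insert them into a fresh grid.
    Assumes the archive holds at least as many elites as PCA components."""
    elites = list(archive.values())
    model = fit_pca(np.stack([f for f, _ in elites]))
    fresh = {}
    for feature, fitness in elites:
        insert(fresh, model, feature, fitness, grid_size)
    return model, fresh

def run(n_iterations=300, retrain_every=100, grid_size=10, n_features=16, seed=0):
    rng = np.random.default_rng(seed)
    seeds = rng.normal(size=(20, n_features))
    model = fit_pca(seeds)
    archive = {}
    for f in seeds:
        insert(archive, model, f, toy_quality(f), grid_size)
    for it in range(1, n_iterations + 1):
        keys = list(archive)
        parent, _ = archive[keys[rng.integers(len(keys))]]   # uniform elite selection
        child = parent + rng.normal(scale=0.3, size=n_features)  # Gaussian mutation
        insert(archive, model, child, toy_quality(child), grid_size)
        if it % retrain_every == 0:  # dynamic behaviour-space reconfiguration
            model, archive = reconfigure(archive, grid_size)
    return archive

archive = run()
coverage = len(archive) / (10 * 10)  # fraction of occupied grid cells
```

Re-inserting the elites after each refit means that two elites which now fall into the same cell compete again, so the archive can shrink at a reconfiguration step. This is one plausible reading of why dynamic reconfiguration might trade final coverage for sustained exploration; the sketch does not reproduce the paper's reported numbers.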

Paper Structure

This paper contains 32 sections, 6 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Visualisation of our MAP-Elites QD search pipeline for discovering sounds, with behaviour space container statuses at different stages of evolution on the top left, defined by unsupervised projection models.
  • Figure 2: Performance comparison: Automatic vs manual behaviour space definition and dynamic vs static configuration. Solid lines show mean values across 5 independent runs. Shaded areas represent 95% confidence intervals. Dynamic PCA achieves significantly higher diversity than Manual BD ($p < 0.001$), indicating that automatic approaches can outperform expert-defined spaces. PCA shows superior diversity generation and computational efficiency, whilst autoencoders provide competitive performance with less auditory noise.
  • Figure 3: Coverage analysis when remapping unsupervised discoveries to manually defined behaviour space. Top row shows coverage achieved by different unsupervised approaches in their native spaces. Bottom row shows how these same discoveries map to the manual BD space (spectral slope vs rolloff). While direct search within the manual BD achieves the highest coverage (rightmost panel), remapped discoveries from automatic BD reach most areas of the manually defined container, demonstrating that unsupervised approaches maintain substantial coverage of expert-defined sonic characteristics while achieving higher overall diversity.
  • Figure 4: Evolutionary pathways visualisation showing how behaviour space definition and redefinition affect exploration patterns. These radial tree diagrams display the phylogenetic relationships formed during the quality diversity sound search simulations, with initial seeds positioned at the centre and evolutionary branching patterns radiating outward. (A) Static PCA without retraining. (B) Dynamic PCA with periodic retraining and remapping. (C) Manually defined behaviour space (spectral slope vs rolloff). (D) Dynamic autoencoder with incremental fine-tuning. The vertical colour bar indicates the fitness used for colouring branches.
  • Figure 5: Performance comparison of alternative reference approaches for quality evaluation. Multiple-reference approaches (with k=15) achieve the highest diversity. Reference-free evaluation performs comparably to single-reference methods and avoids selection bias.
  • ...and 2 more figures