Accompaniment Prompt Adherence: A Measure for Evaluating Music Accompaniment Systems

Maarten Grachten, Javier Nistal

TL;DR

The paper addresses the lack of standardized metrics for evaluating how well a generated accompaniment stem adheres to a given musical context prompt. It introduces Accompaniment Prompt Adherence (APA), a distribution-based metric built on Fréchet Audio Distance and powered by pre-trained embeddings like CLAP, defined as $\mathrm{APA} = \frac{1}{2} + \frac{\mathrm{FAD}_{C,R'} - \mathrm{FAD}_{C,R}}{2 \, \mathrm{FAD}_{R,R'}}$, clipped to $[0,1]$, to quantify adherence without training. APA is validated through objective perturbations and subjective listening tests, showing alignment with human judgments and sensitivity to degradations, and is implemented in an open-source Python package. The work demonstrates the practical utility of APA for evaluating and comparing music accompaniment generation systems across diverse datasets and embedding configurations.
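To make the definition concrete, here is a minimal sketch of the final APA step, assuming the three pairwise Fréchet Audio Distances have already been computed elsewhere: candidate set $C$ to matched reference set $R$, $C$ to the mismatched negative anchor $R'$, and $R$ to $R'$. The function name `apa` and its argument names are hypothetical choices for illustration and are not the API of the released package.

```python
def apa(fad_c_r: float, fad_c_rneg: float, fad_r_rneg: float) -> float:
    """Sketch of APA from three precomputed FAD values.

    fad_c_r    -- FAD between candidate set C and matched reference set R
    fad_c_rneg -- FAD between C and the mismatched negative anchor R'
    fad_r_rneg -- FAD between R and R' (normalizes the difference)

    All names here are illustrative assumptions, not the package's API.
    """
    if fad_r_rneg <= 0:
        # The formula divides by FAD(R, R'), so the anchors must differ.
        raise ValueError("FAD between R and R' must be positive")
    raw = 0.5 + (fad_c_rneg - fad_c_r) / (2.0 * fad_r_rneg)
    # Clip to [0, 1] as stated in the definition.
    return min(1.0, max(0.0, raw))


if __name__ == "__main__":
    # C coincides with R: maximal adherence.
    print(apa(fad_c_r=0.0, fad_c_rneg=1.0, fad_r_rneg=1.0))
    # C is equally far from R and R': the midpoint of the scale.
    print(apa(fad_c_r=1.0, fad_c_rneg=1.0, fad_r_rneg=2.0))
```

Under this reading, a candidate set that lies closer to the matched references than to the negative anchor scores above 0.5, and one that lies closer to the negative anchor scores below it. The score therefore depends on relative rather than absolute distances.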

Abstract

Generative systems of musical accompaniments are rapidly growing, yet there are no standardized metrics to evaluate how well generations align with the conditional audio prompt. We introduce a distribution-based measure called "Accompaniment Prompt Adherence" (APA), and validate it through objective experiments on synthetic data perturbations, and human listening tests. Results show that APA aligns well with human judgments of adherence and is discriminative to transformations that degrade adherence. We release a Python implementation of the metric using the widely adopted pre-trained CLAP embedding model, offering a valuable tool for evaluating and comparing accompaniment generation systems.

Paper Structure

This paper contains 16 sections, 2 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Absolute vs relative distances; Top: Counter-example for the naive approach, where a mismatched candidate set $C'$ is closer to (matched) reference set $R$ than the matched candidate set $C$ in absolute terms; Bottom: Given the same absolute distances, $C$ can still be closer to $R$, relative to negative anchor $R'$
  • Figure 2: Top: CLES values (see the validation experiments section) for (a) different mix regimes, (b) embedders, and (c) projections; Bottom: The effect of invariant and non-invariant transformations on $\mathrm{APA}$ values, for reference and candidate sets from (d) the same music collection; and (e) different collections; (f) Comparison of $\mathrm{APA}$ values against subjective human ratings of accompaniment prompt adherence.