The first Cadenza challenges: using machine learning competitions to improve music for listeners with a hearing loss

Gerardo Roa Dabike, Michael A. Akeroyd, Scott Bannister, Jon P. Barker, Trevor J. Cox, Bruno Fazenda, Jennifer Firth, Simone Graetzer, Alinka Greasley, Rebecca R. Vos, William M. Whitmer

TL;DR

This paper presents the first application of open challenge methodology to improve music listening for individuals with hearing loss. Two competitions (CAD1 with headphone listening and ICASSP24 with loudspeaker listening) framed the task as demixing into vocals, drums, bass, and other stems (VDBO), followed by remixing and NAL-R-like amplification, with systems evaluated using the Hearing-Aid Audio Quality Index (HAAQI). CAD1 yielded no improvement over the best baseline. ICASSP24 introduced loudspeaker effects and gains applied before remixing, and several entrants surpassed the baselines, with the highest-scoring system using an ensemble of demixing algorithms. The study establishes open benchmarks with shared datasets, baselines, and evaluation tools to catalyze future research, and it motivates CAD2 to broaden tasks to lyric intelligibility and classical-music scenarios with non-linear amplification and causal processing to better suit hearing-aid and live listening contexts.

Abstract

It is well established that listening to music is an issue for those with hearing loss, and hearing aids are not a universal solution. How can machine learning be used to address this? This paper details the first application of the open challenge methodology to use machine learning to improve audio quality of music for those with hearing loss. The first challenge was a stand-alone competition (CAD1) and had 9 entrants. The second was a 2024 ICASSP grand challenge (ICASSP24) and attracted 17 entrants. The challenge tasks concerned demixing and remixing pop/rock music to allow a personalised rebalancing of the instruments in the mix, along with amplification to correct for raised hearing thresholds. The software baselines provided for entrants to build upon used two state-of-the-art demix algorithms: Hybrid Demucs and Open-Unmix. Evaluation of systems was done using the objective metric HAAQI, the Hearing-Aid Audio Quality Index. No entrants improved on the best baseline in CAD1 because there was insufficient room for improvement. Consequently, for ICASSP24 the scenario was made more difficult by using loudspeaker reproduction and specified gains to be applied before remixing. This also made the scenario more useful for listening through hearing aids. Nine entrants scored better than the best ICASSP24 baseline. Most entrants used a refined version of Hybrid Demucs and NAL-R amplification. The highest scoring system combined the outputs of several demixing algorithms in an ensemble approach. These challenges are now open benchmarks for future research with the software and data being freely available.
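
The "specified gains to be applied before remixing" step can be pictured with a minimal sketch. This is not the challenge baseline code: the stem labels, the function names `db_to_linear` and `remix`, and the mono list-of-samples representation are all assumptions made for illustration. It also leaves out the demixing itself, the NAL-R amplification, and the HAAQI evaluation.

```python
from typing import Dict, List

# VDBO stem labels (assumed names, chosen for this illustration only)
STEMS = ("vocals", "drums", "bass", "other")


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def remix(stems: Dict[str, List[float]], gains_db: Dict[str, float]) -> List[float]:
    """Scale each separated stem by its gain and sum the results into one signal.

    Stems missing from `gains_db` are left at 0 dB (unchanged level).
    """
    lengths = {len(samples) for samples in stems.values()}
    if len(lengths) != 1:
        raise ValueError("all stems must have the same number of samples")
    mix = [0.0] * lengths.pop()
    for name, samples in stems.items():
        factor = db_to_linear(gains_db.get(name, 0.0))
        for i, sample in enumerate(samples):
            mix[i] += factor * sample
    return mix


# Toy example: boost the vocals by 6 dB and cut the drums by 3 dB.
toy_stems = {name: [0.0, 0.1, -0.1, 0.05] for name in STEMS}
rebalanced = remix(toy_stems, {"vocals": 6.0, "drums": -3.0})
```

In a real system each stem would be the output of a separation model, and the remixed signal would then be amplified for the listener's hearing thresholds before being scored.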

Paper Structure (22 sections, 2 equations, 5 figures, 3 tables)

This paper contains 22 sections, 2 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: The scenarios for (a) CAD1 headphone listening and (b) ICASSP24 loudspeaker listening via hearing aids. HRTF, Head-Related Transfer Function.
  • Figure 2: General structure of the challenges.
  • Figure 3: Baseline architecture for CAD1 and ICASSP24.
  • Figure 4: HAAQI scores for remix (downmix) vs system for CAD1. Baseline systems shown in pink.
  • Figure 5: HAAQI scores vs system for ICASSP24. Baselines shown in pink.