Towards auditory attention decoding with noise-tagging: A pilot study

H. A. Scheppink, S. Ahmadi, P. Desain, M. Tangermann, J. Thielen

TL;DR

This pilot study makes a first step towards AAD using the noise-tagging stimulus protocol, which evokes reliable code-modulated evoked potentials, but is minimally explored in the auditory modality.

Abstract

Auditory attention decoding (AAD) aims to identify, from brain activity, the attended speaker among several candidate speakers, offering promising applications for neuro-steered hearing devices and brain-computer interfacing. This pilot study makes a first step towards AAD using the noise-tagging stimulus protocol, which evokes reliable code-modulated evoked potentials, but is minimally explored in the auditory modality. Participants were sequentially presented with two Dutch speech stimuli, each amplitude-modulated with a unique binary pseudo-random noise-code, effectively tagging them with additional decodable information. We compared the decoding of unmodulated audio against audio modulated at various modulation depths, and a conventional AAD method against a standard method for decoding noise-codes. Our pilot study revealed higher performance for the conventional method at modulation depths of 70 to 100 percent than for unmodulated audio. The noise-code decoder did not further improve these results. These fundamental insights highlight the potential of integrating noise-codes in speech to enhance auditory speaker detection when multiple speakers are presented simultaneously.

Paper Structure (4 equations, 2 figures, 1 table)

This paper contains 4 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Visualization of three different modulation depths using noise-tagging. Depicted are the unmodulated audio, i.e., 0 percent (blue), and the 50 percent (gold) and 100 percent (brown) modulated audio. Also shown are the smoothed noise-tags used for modulation (black). Audio was amplitude-modulated by multiplying it with the noise-code, retaining the full audio amplitude when the code is 1 and only a fraction of it when the code is 0. Therefore, the noise-code for 50 percent modulation ranges over 0.5--1, instead of 0--1 for 100 percent modulation. To ease comparison, the original audio (light gray) is drawn behind the modulated audio.
  • Figure 2: Decoding accuracy across decision window length and modulation depth. Depicted is the grand average classification accuracy as a function of decision window length $\tau$. Colored lines represent the five modulation depths in percent: 100 (blue), 90 (orange), 70 (green), 50 (red), and 0 (pink). Solid lines show the performance of rCCA and dashed lines that of eCCA. The dashed horizontal gray line marks the theoretical chance-level accuracy (50%).
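
The amplitude modulation described in the caption of Figure 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `modulate`, the bit rate, the sampling rate, and the sine wave standing in for speech are all assumptions, and a plain random binary sequence stands in for the pseudo-random noise-code. The smoothing of the noise-tags shown in Figure 1 is left out because the smoothing method is not given here. The only part taken from the caption is the scaling rule: the code is rescaled so that it ranges between (1 - depth) and 1, and is then multiplied with the audio.

```python
import numpy as np


def modulate(audio, code, depth):
    """Amplitude-modulate `audio` with a binary `code` at modulation `depth`.

    depth is a fraction in [0, 1]: 0 leaves the audio untouched, 1 silences
    the audio wherever the code is 0. Returns (modulated_audio, envelope).
    """
    audio = np.asarray(audio, dtype=float)
    code = np.asarray(code, dtype=float)
    if audio.shape != code.shape:
        raise ValueError("audio and code must have the same shape")
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must lie in [0, 1]")
    # Code value 1 keeps full amplitude; code value 0 keeps (1 - depth).
    envelope = 1.0 - depth * (1.0 - code)
    return audio * envelope, envelope


if __name__ == "__main__":
    fs = 8000               # assumed audio sampling rate in Hz
    bit_rate = 40           # assumed code presentation rate in bits per second
    samples_per_bit = fs // bit_rate

    rng = np.random.default_rng(0)
    bits = rng.integers(0, 2, size=bit_rate)      # 1 s stand-in binary code
    code = np.repeat(bits, samples_per_bit)       # upsample to audio rate

    t = np.arange(code.size) / fs
    audio = np.sin(2 * np.pi * 220 * t)           # stand-in for a speech signal

    for depth in (0.0, 0.5, 1.0):
        _, envelope = modulate(audio, code, depth)
        print(f"depth={depth:.1f}: envelope range "
              f"{envelope.min():.2f} to {envelope.max():.2f}")
```

Writing the envelope as `1 - depth * (1 - code)` makes the unmodulated condition a special case (depth 0 gives an envelope of 1 everywhere), so all modulation conditions go through the same code path.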