Sample-level EEG-based Selective Auditory Attention Decoding with Markov Switching Models

Yuanyuan Yao; Simon Geirnaert; Tinne Tuytelaars; Alexander Bertrand

TL;DR

This work addresses sample-level selective auditory attention decoding (sAAD) from EEG by integrating decoding and temporal smoothing into a single Markov Switching Model (MSM). The MSM models both the state-dependent relationship between EEG and speech envelopes and the temporal dynamics of attention, with model parameters and attention states jointly estimated via expectation-maximization (EM). This enables decay-free, per-sample attention inference with faster switch detection than traditional HMM post-processing. Results on a two-speaker EEG dataset show that the MSM achieves decoding accuracy comparable to HMM-based smoothing while significantly reducing switch-detection latency, a practical benefit for real-time neuro-steered auditory systems. The approach could be extended to non-linear mappings and other inference techniques to improve robustness under low-SNR conditions.

Abstract

Selective auditory attention decoding aims to identify the speaker of interest from listeners' neural signals, such as electroencephalography (EEG), in the presence of multiple concurrent speakers. Most existing methods operate at the window level, facing a trade-off between temporal resolution and decoding accuracy. Recent work has shown that hidden Markov model (HMM)-based post-processing can smooth window-level decoder outputs to improve this trade-off. Instead of using a separate smoothing step, we propose to integrate the decoding and smoothing components into a single probabilistic framework using a Markov switching model (MSM). The MSM directly models the relationship between the EEG and speech envelopes under each attention state while incorporating the temporal dynamics of attention. This formulation enables sample-level attention decoding, with model parameters and attention states jointly estimated via the expectation-maximization algorithm. Experimental results demonstrate that this integrated MSM formulation achieves decoding accuracy comparable to that of HMM post-processing while providing faster attention switch detection. The code for the proposed method is available at https://github.com/YYao-42/MSM.
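
As a rough illustration of the state-inference step that an EM procedure over a Markov chain of attention states relies on, the sketch below implements a generic scaled forward-backward pass. It is not taken from the authors' repository. The function name `forward_backward` and its inputs are hypothetical, and the per-sample log-likelihoods `log_lik` are assumed to be supplied by some state-dependent EEG-to-envelope model that is not reproduced here.

```python
import math


def forward_backward(log_lik, trans, init):
    """Scaled forward-backward pass for a discrete-state Markov chain.

    log_lik[t][k]: log-likelihood of sample t under state k
                   (hypothetical input, assumed to be computed elsewhere).
    trans[i][j]:   P(z_t = j | z_{t-1} = i).
    init[k]:       P(z_0 = k).
    Returns (gamma, log_evidence), where gamma[t][k] = P(z_t = k | all samples).
    """
    T, K = len(log_lik), len(init)
    # Subtract the per-sample maximum before exponentiating, for stability.
    shift = [max(row) for row in log_lik]
    lik = [[math.exp(v - m) for v in row] for row, m in zip(log_lik, shift)]

    # Forward pass: alpha[t] is the filtered posterior P(z_t | samples 0..t).
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            pred = list(init)
        else:
            pred = [sum(alpha[t - 1][i] * trans[i][k] for i in range(K))
                    for k in range(K)]
        unnorm = [pred[k] * lik[t][k] for k in range(K)]
        c = sum(unnorm)
        alpha.append([v / c for v in unnorm])
        scale.append(c)

    # Backward pass, reusing the forward scaling constants.
    beta = [[1.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(K):
            beta[t][i] = sum(trans[i][j] * lik[t + 1][j] * beta[t + 1][j]
                             for j in range(K)) / scale[t + 1]

    # Smoothed per-sample state posteriors.
    gamma = [[alpha[t][k] * beta[t][k] for k in range(K)] for t in range(T)]
    log_evidence = sum(math.log(c) + m for c, m in zip(scale, shift))
    return gamma, log_evidence
```

In an EM loop, posteriors such as `gamma` would typically weight the parameter updates in the M-step, and thresholding `gamma[t]` gives a per-sample attention decision. A "sticky" transition matrix (large diagonal entries) is what suppresses isolated noisy samples.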

Paper Structure (9 sections, 7 equations, 1 figure)

Introduction
Markov Switching Model
Forward-backward algorithm
EM iterations
Initialization
Experiment
Metrics and parameters
Results
Conclusion

Figures (1)

Figure 1: Performance comparison of the proposed MSM method and the HMM post-processing method (operating on 1-second windows) under different settings. The dots mark the cross-validated outcomes for each participant. Box-and-whisker plots display the median and quartiles, with the whiskers encompassing the full data spread excluding outliers. "*" indicates that the results of MSM are significantly lower than those of HMM (Wilcoxon signed-rank test, $p<0.05$). The decoding accuracy of the LS decoder without post-processing is also shown for reference. Its switch detection time is not reported, because its noisy outputs produce a highly fragmented state sequence from which no meaningful switch detection time can be computed.
