Sample-level EEG-based Selective Auditory Attention Decoding with Markov Switching Models
Yuanyuan Yao, Simon Geirnaert, Tinne Tuytelaars, Alexander Bertrand
TL;DR
This work addresses sample-level selective auditory attention decoding (sAAD) from EEG by integrating decoding and temporal smoothing into a single Markov switching model (MSM). The MSM jointly models the state-dependent relationship between the EEG and the speech envelopes together with the temporal dynamics of attention, and its parameters and attention states are estimated jointly via expectation-maximization (EM). This enables per-sample attention inference with faster switch detection than HMM post-processing of window-level decoder outputs. Empirical results on a two-speaker EEG dataset show that the MSM achieves decoding accuracy comparable to HMM-based smoothing while reducing switch-detection latency, a practical benefit for real-time neuro-steered auditory systems. The approach can be extended to non-linear mappings and other inference techniques, which may further improve robustness in low-SNR conditions.
Abstract
Selective auditory attention decoding aims to identify the speaker of interest from listeners' neural signals, such as electroencephalography (EEG), in the presence of multiple concurrent speakers. Most existing methods operate at the window level, facing a trade-off between temporal resolution and decoding accuracy. Recent work has shown that hidden Markov model (HMM)-based post-processing can smooth window-level decoder outputs to improve this trade-off. Instead of using a separate smoothing step, we propose to integrate the decoding and smoothing components into a single probabilistic framework using a Markov switching model (MSM). The MSM directly models the relationship between the EEG and speech envelopes under each attention state while incorporating the temporal dynamics of attention. This formulation enables sample-level attention decoding, with model parameters and attention states jointly estimated via the expectation-maximization algorithm. Experimental results demonstrate that this integrated MSM formulation achieves decoding accuracy comparable to HMM post-processing while providing faster attention switch detection. The code for the proposed method is available at https://github.com/YYao-42/MSM.
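To make the idea of sample-level state inference concrete, the following is a minimal sketch of forward-backward smoothing over two attention states. It is not the implementation from the repository above. It assumes fixed (not EM-estimated) parameters, Gaussian residuals, and white-noise stand-ins for the speech envelopes. The names `forward_backward`, `s`, `y`, `sigma`, and `trans` are hypothetical and chosen only for illustration. In the full method, a step of this kind would presumably alternate with parameter updates inside EM.

```python
import numpy as np


def forward_backward(log_lik, trans, prior):
    """Posterior state probabilities P(z_t = k | all observations).

    log_lik: (T, K) per-sample log-likelihood of the observation under each state
    trans:   (K, K) transition matrix, trans[i, j] = P(z_t = j | z_{t-1} = i)
    prior:   (K,) initial state distribution
    """
    T, K = log_lik.shape
    # Subtract the per-sample maximum before exponentiating for numerical
    # stability; this rescaling cancels once the posteriors are normalized.
    lik = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))

    # Forward pass with per-step normalization.
    alpha = np.zeros((T, K))
    alpha[0] = prior * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * lik[t]
        alpha[t] /= alpha[t].sum()

    # Backward pass, also normalized at every step.
    beta = np.ones((T, K))
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma


# Toy data (assumption): two white-noise "envelopes" and a single attention
# switch halfway through the recording.
rng = np.random.default_rng(0)
T = 2000
s = rng.standard_normal((T, 2))           # stand-ins for the two speech envelopes
z_true = np.zeros(T, dtype=int)
z_true[T // 2:] = 1                       # attention switches to speaker 1

# Stand-in for an EEG-derived envelope estimate: attended envelope plus noise.
sigma = 1.0
y = s[np.arange(T), z_true] + sigma * rng.standard_normal(T)

# Gaussian log-likelihood of y under each attention state (constants dropped).
log_lik = -0.5 * ((y[:, None] - s) / sigma) ** 2

# "Sticky" transition matrix: attention rarely switches between samples.
trans = np.array([[0.99, 0.01],
                  [0.01, 0.99]])
prior = np.array([0.5, 0.5])

gamma = forward_backward(log_lik, trans, prior)
z_hat = gamma.argmax(axis=1)              # per-sample attention decision
accuracy = float((z_hat == z_true).mean())
print(f"sample-level accuracy on toy data: {accuracy:.3f}")
```

The sticky transition matrix plays the role of the temporal smoothing: a single noisy sample carries little evidence, but the chain accumulates evidence across neighboring samples, so a decision is available at every sample rather than once per window.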
