Lightweight Diffusion-based Framework for Online Imagined Speech Decoding in Aphasia
Eunyeong Ko, Soowon Kim, Ha-Na Jo
TL;DR
This study targets real-time imagined speech decoding for individuals with aphasia, using a two-session framework that pairs offline data acquisition with online feedback. It introduces a lightweight diffusion-based decoding model whose architectural simplifications enable low-latency inference, evaluated in a single individual with chronic anomic aphasia performing a four-class Korean-language task. In real-time evaluation the system reached 65 percent top-1 and 70 percent top-2 accuracy overall, with the Water class reaching 80 percent top-1 and 100 percent top-2, suggesting viability for communication-oriented BCI applications while highlighting the need for broader testing and vocabulary expansion. Methodologically, the work combines careful task design, subject-specific modeling, and signal-processing optimizations to bridge offline research and practical clinical use.
Abstract
Individuals with aphasia experience severe difficulty in real-time verbal communication, while most imagined speech decoding approaches remain limited to offline analysis or computationally demanding models. To address this limitation, we propose a two-session experimental framework consisting of an offline data acquisition phase and a subsequent online feedback phase for real-time imagined speech decoding. The paradigm employed a four-class Korean-language task, including three imagined speech targets selected according to the participant's daily communicative needs and a resting-state condition, and was evaluated in a single individual with chronic anomic aphasia. Within this framework, we introduce a lightweight diffusion-based neural decoding model explicitly optimized for real-time inference, achieved through architectural simplifications such as dimensionality reduction, temporal kernel optimization, group normalization with regularization, and dual early-stopping criteria. In real-time evaluation, the proposed system achieved 65 percent top-1 and 70 percent top-2 accuracy, with the Water class reaching 80 percent top-1 and 100 percent top-2 accuracy. These results demonstrate that real-time-optimized diffusion-based architectures, combined with clinically grounded task design, can support feasible online imagined speech decoding for communication-oriented BCI applications in aphasia.
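The top-1 and top-2 figures reported above can be read as follows: a trial counts as a top-k hit when the cued class is among the k highest-scoring classes output by the decoder. The sketch below is a minimal illustration of that metric, overall and per class, and is not the authors' evaluation code. The function names, the class ordering, the placeholder labels "word_b" and "word_c", and the toy scores are all hypothetical.

```python
from collections import defaultdict

# Hypothetical class order: only "water" and the resting-state condition are
# named in the abstract; "word_b" and "word_c" are placeholders.
CLASSES = ["water", "word_b", "word_c", "rest"]


def is_top_k_hit(trial_scores, true_label, k):
    """Return True if true_label is among the k highest-scoring class indices."""
    ranked = sorted(range(len(trial_scores)),
                    key=lambda c: trial_scores[c], reverse=True)
    return true_label in ranked[:k]


def top_k_accuracy(scores, labels, k=1):
    """Fraction of trials whose true label is within the top-k predictions."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal length")
    hits = sum(is_top_k_hit(s, y, k) for s, y in zip(scores, labels))
    return hits / len(labels)


def per_class_top_k(scores, labels, k=1):
    """Top-k accuracy computed separately for each true class that occurs."""
    hits, totals = defaultdict(int), defaultdict(int)
    for s, y in zip(scores, labels):
        totals[y] += 1
        hits[y] += is_top_k_hit(s, y, k)
    return {CLASSES[c]: hits[c] / totals[c] for c in totals}


# Toy decoder outputs (one row of class scores per trial); not real data.
toy_scores = [
    [0.7, 0.1, 0.1, 0.1],  # true: water, ranked first
    [0.3, 0.4, 0.2, 0.1],  # true: water, ranked second
    [0.1, 0.2, 0.6, 0.1],  # true: word_c, ranked first
    [0.4, 0.3, 0.2, 0.1],  # true: rest, ranked last
]
toy_labels = [0, 0, 2, 3]

if __name__ == "__main__":
    print("top-1:", top_k_accuracy(toy_scores, toy_labels, k=1))
    print("top-2:", top_k_accuracy(toy_scores, toy_labels, k=2))
    print("per-class top-2:", per_class_top_k(toy_scores, toy_labels, k=2))
```

With four classes, chance level is 25 percent for top-1 and 50 percent for top-2, which is a useful baseline when interpreting the reported accuracies.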
