Emotion-Agent: Unsupervised Deep Reinforcement Learning with Distribution-Prototype Reward for Continuous Emotional EEG Analysis
Zhihao Zhou, Qile Liu, Jiyuan Wang, Zhen Liang
TL;DR
This work tackles the challenge of identifying task-relevant emotional moments in continuous EEG without labeled segments. It introduces Emotion-Agent, an unsupervised framework that combines heuristic prototype learning with distribution-aware rewards in a PPO-based reinforcement learning setup to locate emotionally informative EEG segments. Prototypes that capture the global emotion structure (obtained via K-Means on differential entropy features) inform two rewards, a center reward and an inter-intra reward, which guide the policy toward emotionally representative segments. Experiments on SEED and DEAP show that selecting segments with Emotion-Agent improves downstream emotion recognition accuracy, demonstrating the value of unsupervised segment extraction for affective brain-computer interfaces (aBCIs). The approach advances unsupervised segment selection in dynamic EEG contexts and offers a practical workflow for improving aBCI reliability and performance.
Abstract
Continuous electroencephalography (EEG) signals are widely used in affective brain-computer interface (aBCI) applications. However, not all continuously collected EEG signals are relevant or meaningful to the task at hand (e.g., periods of wandering thoughts). Moreover, manually labeling the relevant parts is nearly impossible because engagement patterns vary across tasks and individuals. Effectively and efficiently identifying the important parts of continuous EEG recordings is therefore crucial for downstream BCI tasks, as it directly affects the accuracy and reliability of the results. In this paper, we propose a novel unsupervised deep reinforcement learning framework, called Emotion-Agent, to automatically identify relevant and informative emotional moments in continuous EEG signals. Specifically, Emotion-Agent combines unsupervised deep reinforcement learning with a heuristic algorithm. We first use the heuristic algorithm to perform an initial global search and form prototype representations of the EEG signals, which facilitates efficient exploration of the signal space and identifies potential regions of interest. Then, we design distribution-prototype reward functions to estimate the interactions between samples and prototypes, ensuring that the identified parts are both relevant and representative of the underlying emotional states. Emotion-Agent is trained using Proximal Policy Optimization (PPO) to achieve stable and efficient convergence. Our experiments compare downstream performance with and without Emotion-Agent. The results demonstrate that selecting relevant and informative emotional parts before passing them to downstream tasks enhances the accuracy and reliability of aBCI applications.
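
To make the prototype-and-reward idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes differential entropy (DE) features computed under a Gaussian assumption, a plain K-Means with farthest-first initialization to form prototypes, and two hypothetical reward definitions: `center_reward` (negative distance to the nearest prototype) and `inter_intra_reward` (ratio of the second-nearest to the nearest prototype distance). The abstract does not give the exact reward formulas, so both functions, all names, and the synthetic data are illustrative assumptions. The PPO policy that would consume these rewards is not shown.

```python
import numpy as np


def differential_entropy(segments):
    """DE per channel under a Gaussian assumption: 0.5 * ln(2*pi*e*variance).

    segments: array of shape (n_segments, n_channels, n_samples).
    """
    variance = np.var(segments, axis=-1)
    return 0.5 * np.log(2.0 * np.pi * np.e * variance)


def kmeans_prototypes(features, k, n_iter=100):
    """Plain Lloyd's K-Means; returns k prototype vectors (cluster centers).

    Farthest-first initialization is an assumption made here for determinism.
    """
    centers = [features[0]]
    while len(centers) < k:
        diffs = features[:, None, :] - np.array(centers)[None, :, :]
        dists = np.linalg.norm(diffs, axis=2)
        centers.append(features[dists.min(axis=1).argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return centers


def center_reward(feature, prototypes):
    """Hypothetical center reward: larger (closer to 0) near a prototype."""
    dists = np.linalg.norm(prototypes - feature, axis=1)
    return float(-dists.min())


def inter_intra_reward(feature, prototypes, eps=1e-8):
    """Hypothetical inter-intra reward: distance to the second-nearest
    prototype divided by distance to the nearest one (requires k >= 2)."""
    dists = np.sort(np.linalg.norm(prototypes - feature, axis=1))
    return float(dists[1] / (dists[0] + eps))


# Synthetic stand-in for EEG: two groups of segments with different variance.
rng = np.random.default_rng(0)
segments = np.concatenate([
    rng.normal(0.0, 1.0, size=(50, 4, 200)),
    rng.normal(0.0, 3.0, size=(50, 4, 200)),
])
features = differential_entropy(segments)      # shape (100, 4)
prototypes = kmeans_prototypes(features, k=2)  # shape (2, 4)
midpoint = prototypes.mean(axis=0)             # an ambiguous, unrepresentative point

rewards = [
    (center_reward(f, prototypes), inter_intra_reward(f, prototypes))
    for f in features
]
```

Under these assumed definitions, a segment whose features sit close to one prototype and far from the others scores well on both rewards, while an ambiguous segment such as `midpoint` scores poorly. A policy trained on such signals would then be encouraged to keep the former and skip the latter.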
