Evaluation in EEG Emotion Recognition: State-of-the-Art Review and Unified Framework

Natia Kukhilava, Tatia Tsmindashvili, Rapael Kalandadze, Anchit Gupta, Sofio Katamadze, François Brémond, Laura M. Ferrari, Philipp Müller, Benedikt Emanuel Wirth

TL;DR

This paper addresses the fragmentation in EEG-ER evaluation by surveying 216 studies and identifying major inconsistencies in datasets, preprocessing, splitting, ground-truth definitions, and metrics. It proposes EEGain, an open-source, end-to-end framework that unifies data loading for six major datasets, standard preprocessing, LOSO/LOTO splits, four common models, and a comprehensive logging system to enable fair, reproducible benchmarking. The authors validate EEGain by replicating TSception on DEAP and conducting LOSO experiments across six datasets, establishing robust baselines and highlighting cross-subject generalization gaps. Collectively, the work provides concrete guidelines and tooling to accelerate transparent, comparable progress in EEG-ER and encourages multi-dataset evaluation for reliable state-of-the-art assessment.

Abstract

Electroencephalography-based Emotion Recognition (EEG-ER) has become a growing research area in recent years. Analyzing 216 papers published between 2018 and 2023, we find that the field lacks a unified evaluation protocol, which is essential to fairly define the state of the art, compare new approaches, and track the field's progress. We report the main inconsistencies among the evaluation protocols in use, which relate to ground truth definition, evaluation metric selection, data splitting types (e.g., subject-dependent or subject-independent), and the use of different datasets. Building on this review of the state of the art, we propose a unified evaluation protocol, EEGain (https://github.com/EmotionLab/EEGain), which enables easy and efficient evaluation of new methods and datasets. EEGain is a novel open-source software framework that makes it possible to compare, and thus define, state-of-the-art results. EEGain includes standardized methods for data pre-processing, data splitting, and evaluation metrics, and it can load the six most relevant EEG-ER datasets (i.e., AMIGOS, DEAP, DREAMER, MAHNOB-HCI, SEED, SEED-IV) with only a single line of code. In addition, we have assessed and validated EEGain using these six datasets on the four most common publicly available methods (EEGNet, DeepConvNet, ShallowConvNet, TSception). This is a significant step toward making research on EEG-ER more reproducible and comparable, thereby accelerating the overall progress of the field.
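
To make the subject-independent splitting mentioned in the abstract concrete, the following is a minimal sketch of a leave-one-subject-out (LOSO) split in plain Python. It is an illustration only and does not reproduce EEGain's API: the function name `loso_splits` and the subject identifiers are hypothetical, and real pipelines would operate on EEG trials rather than bare indices.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def loso_splits(
    subject_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    subject_ids[i] names the subject who produced sample i.
    Hypothetical helper for illustration; not part of EEGain.
    """
    # Group sample indices by the subject they belong to.
    by_subject: Dict[str, List[int]] = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)

    # Each subject is held out exactly once; all others form the training set.
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = sorted(
            i
            for subject, indices in by_subject.items()
            if subject != held_out
            for i in indices
        )
        yield held_out, train_idx, test_idx


# Toy example: six samples from three hypothetical subjects.
subjects = ["s01", "s01", "s02", "s02", "s03", "s03"]
folds = list(loso_splits(subjects))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Because no subject ever appears in both the training and the test set of a fold, the resulting scores reflect generalization to unseen subjects. A subject-dependent split, by contrast, would mix samples from the same subject across both sets.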

Paper Structure

This paper contains 43 sections, 10 figures, 4 tables.

Figures (10)

  • Figure 1: (a) The circumplex model of emotion, (b) The Self-Assessment Manikin (SAM).
  • Figure 2: Flow diagram illustrating the inclusion steps of publications into the literature review.
  • Figure 3: Number of publications per year included in the literature review. In total, 217 publications were included.
  • Figure 4: Number of publications using a specific number of different datasets. As can be seen, only a minority of studies used three or more datasets.
  • Figure 5: Number of studies using specific datasets.
  • ...and 5 more figures