Datasets for Valence and Arousal Inference: A Survey

Helen Schneider, Svetlana Pavlitska, Helen Gremmelmaier, J. Marius Zöllner

TL;DR

This paper addresses the need for a consolidated view of datasets that provide continuous valence ($v$) and arousal ($a$) labels for affect inference across domains. It systematically analyzes 25 publicly available valence-arousal datasets published between 2008 and 2024, detailing their modalities, sensor setups, annotation scales, sizes, and participant distributions. Key findings are that camera-based datasets dominate the field, that multimodal combinations are frequent, and that sensor fusion is receiving growing emphasis, pairing transformer-based vision models with EEG and other physiological signals. The work offers practical guidance for dataset selection, fusion strategy design, and benchmarking in affective computing and human-computer interaction.

Abstract

Understanding human affect can be used in robotics, marketing, education, human-computer interaction, healthcare, entertainment, autonomous driving, and psychology to enhance decision-making, personalize experiences, and improve emotional well-being. This work presents a comprehensive overview of affect inference datasets that utilize continuous valence and arousal labels. We reviewed 25 datasets published between 2008 and 2024, examining key factors such as dataset size, subject distribution, sensor configurations, annotation scales, and data formats for valence and arousal values. While camera-based datasets dominate the field, we also identified several widely used multimodal combinations. Additionally, we explored the most common approaches to affect detection applied to these datasets, providing insights into the prevailing methodologies in the field. Our overview of sensor fusion approaches shows promising advancements in model improvement for valence and arousal inference.

Paper Structure

This paper contains 9 sections, 2 figures, and 1 table.

Figures (2)

  • Figure 1: Overview of modalities used in the analyzed datasets.
  • Figure 2: Russell's circumplex model of affect (dabas_emotion_2018), showing the four quadrants used for classification. The continuous valence (v) and arousal (a) axes are also shown, and the categorical emotions are arranged around the circle according to their v and a scores.
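
The quadrant classification described in the Figure 2 caption can be sketched as a simple thresholding of the continuous labels. This is a minimal illustration, not the survey's method: the function name `circumplex_quadrant`, the label strings, and the default midpoint of 0.0 (which presumes labels centered on zero, e.g. in [-1, 1]) are all assumptions. Datasets that annotate on other scales, such as 1 to 9, would need a different midpoint.

```python
def circumplex_quadrant(valence: float, arousal: float, midpoint: float = 0.0) -> str:
    """Map a continuous (valence, arousal) pair to one of four quadrants.

    Assumption: both labels share the same scale and `midpoint` is its
    neutral value. Values exactly at the midpoint are assigned to the
    positive/high side, which is an arbitrary tie-breaking choice.
    """
    v_side = "positive valence" if valence >= midpoint else "negative valence"
    a_side = "high arousal" if arousal >= midpoint else "low arousal"
    return f"{a_side}, {v_side}"


if __name__ == "__main__":
    # Hypothetical sample points, one per quadrant, on a [-1, 1] scale.
    for v, a in [(0.6, 0.7), (-0.6, 0.7), (-0.6, -0.7), (0.6, -0.7)]:
        print(f"v={v:+.1f}, a={a:+.1f} -> {circumplex_quadrant(v, a)}")
```

For a dataset annotated on a 1 to 9 scale, the same function could be called with `midpoint=5.0`, again assuming that 5 is the neutral point of that scale.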