
Rethinking Affect Analysis: A Protocol for Ensuring Fairness and Consistency

Guanyu Hu, Dimitrios Kollias, Eleni Papadopoulou, Paraskevi Tzouveli, Jie Wei, Xinyu Yang

TL;DR

This work addresses fairness in automatic affect analysis by demonstrating how inconsistent database partitions and uneven demographic distributions can bias results. It introduces a unified, demographic-aware partition protocol and standardized evaluation metrics across Expression Recognition, AU detection, and Valence-Arousal estimation, enabling fair cross-database comparisons. By re-partitioning six affective databases and benchmarking a wide range of models, the study shows that fairness can be achieved without sacrificing performance, and in some cases with improved performance, while revealing persistent age-related biases. The authors provide annotated data, code, and new leaderboards to promote transparent, fair benchmarking and to accelerate ethically sound progress in affective computing.

Abstract

Evaluating affect analysis methods presents challenges due to inconsistencies in database partitioning and evaluation protocols, leading to unfair and biased results. Previous studies claim continuous performance improvements, but our findings challenge such assertions. Using these insights, we propose a unified protocol for database partitioning that ensures fairness and comparability. We provide detailed demographic annotations (in terms of race, gender and age), evaluation metrics, and a common framework for expression recognition, action unit detection and valence-arousal estimation. We also rerun the methods with the new protocol and introduce new leaderboards to encourage future research in affect recognition with a fairer comparison. Our annotations, code, and pre-trained models are available on GitHub: https://github.com/dkollias/Fair-Consistent-Affect-Analysis.
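To make the idea of a demographic-aware partition concrete, here is a minimal sketch of a stratified split. It is not the authors' protocol: the function name `stratified_partition`, the sample fields (`race`, `gender`, `age`), and the 70/15/15 ratios are all assumptions chosen for illustration. The sketch only shows the general principle of spreading each demographic group proportionally across the training, validation, and test sets.

```python
import random
from collections import defaultdict


def stratified_partition(samples, key, ratios=(0.7, 0.15, 0.15), seed=0):
    """Split samples into train/val/test, stratified by a demographic key.

    Hypothetical helper: `key` maps a sample to its demographic group,
    and `ratios` gives the assumed train/val/test proportions.
    """
    rng = random.Random(seed)

    # Bucket the samples by demographic group.
    groups = defaultdict(list)
    for sample in samples:
        groups[key(sample)].append(sample)

    splits = {"train": [], "val": [], "test": []}
    # Iterate over groups in sorted order so the result is reproducible.
    for group in sorted(groups):
        members = groups[group]
        rng.shuffle(members)
        n_train = round(len(members) * ratios[0])
        n_val = round(len(members) * ratios[1])
        # Each group contributes the same proportions to every split;
        # whatever remains after train and val goes to test.
        splits["train"].extend(members[:n_train])
        splits["val"].extend(members[n_train:n_train + n_val])
        splits["test"].extend(members[n_train + n_val:])
    return splits


def demographic_key(sample):
    """Assumed annotation fields, mirroring race, gender and age."""
    return (sample["race"], sample["gender"], sample["age"])
```

Calling `stratified_partition(samples, demographic_key)` on a list of annotated samples returns a dictionary with `train`, `val`, and `test` lists in which every demographic group appears in roughly the same proportion. The sketch does not keep all samples of one subject in the same split, which matters for databases that contain several images or frames per person.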

Paper Structure (11 sections, 8 equations, 4 figures, 7 tables)

Figures (4)

  • Figure 1: The proposed protocol and subsequent partition of a database
  • Figure 2: 'Task Split' part of protocol & partition in case of basic expressions
  • Figure 3: 'Task Split' part of proposed protocol and partition in case of VA
  • Figure 4: 2D Valence-Arousal Histogram: A Comparison Between the Original and New Partitions of AffectNet