Rethinking Affect Analysis: A Protocol for Ensuring Fairness and Consistency
Guanyu Hu, Dimitrios Kollias, Eleni Papadopoulou, Paraskevi Tzouveli, Jie Wei, Xinyu Yang
TL;DR
This work addresses fairness in automatic affect analysis by demonstrating how inconsistent database partitions and uneven demographic distributions can bias results. It introduces a unified, demographic-aware partition protocol and standardized evaluation metrics across Expression Recognition, AU detection, and Valence-Arousal estimation, enabling fair cross-database comparisons. By re-partitioning six affective databases and benchmarking a wide range of models, the study shows that fairness can be achieved without sacrificing performance, and that fair partitioning in some cases even improves performance, while age-related biases persist. The authors provide annotated data, code, and new leaderboards to promote transparent, fair benchmarking and to accelerate ethically sound progress in affective computing.
Abstract
Evaluating affect analysis methods presents challenges due to inconsistencies in database partitioning and evaluation protocols, leading to unfair and biased results. Previous studies claim continuous performance improvements, but our findings challenge such assertions. Using these insights, we propose a unified protocol for database partitioning that ensures fairness and comparability. We provide detailed demographic annotations (in terms of race, gender and age), evaluation metrics, and a common framework for expression recognition, action unit detection and valence-arousal estimation. We also rerun the methods with the new protocol and introduce new leaderboards to encourage future research in affect recognition with a fairer comparison. Our annotations, code, and pre-trained models are available on GitHub: https://github.com/dkollias/Fair-Consistent-Affect-Analysis.
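To make the idea of a demographic-aware partition concrete, the following is a minimal sketch, not the authors' protocol. It assumes each sample carries hypothetical `race`, `gender` and `age` labels, groups samples by the joint label, and splits every group with the same ratios so that each partition keeps roughly the same demographic mix. The function name `demographic_split`, the dictionary keys, and the 60/20/20 ratios are illustrative assumptions; a real protocol would likely also need to keep all samples of one subject in the same partition.

```python
import random
from collections import defaultdict


def demographic_split(samples, ratios=(0.6, 0.2, 0.2), seed=0):
    """Split samples into train/val/test, stratified by demographic group.

    `samples` is a list of dicts with the hypothetical keys
    'id', 'race', 'gender' and 'age' (assumed for illustration only).
    """
    rng = random.Random(seed)

    # Group samples by their joint demographic label.
    groups = defaultdict(list)
    for sample in samples:
        groups[(sample["race"], sample["gender"], sample["age"])].append(sample)

    splits = {"train": [], "val": [], "test": []}
    # Iterate in sorted key order so the result is reproducible for a given seed.
    for key in sorted(groups):
        members = groups[key]
        rng.shuffle(members)
        n_train = round(len(members) * ratios[0])
        n_val = round(len(members) * ratios[1])
        # Whatever remains after train and val goes to test.
        splits["train"].extend(members[:n_train])
        splits["val"].extend(members[n_train:n_train + n_val])
        splits["test"].extend(members[n_train + n_val:])
    return splits
```

Splitting within each group, rather than over the whole pool, is what keeps small demographic groups from ending up entirely in one partition by chance.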
