Progressive Multi-Source Domain Adaptation for Personalized Facial Expression Recognition

Muhammad Osama Zeeshan; Marco Pedersoli; Alessandro Lameiras Koerich; Eric Granger

TL;DR

This work tackles personalized facial expression recognition by reframing multi-source domain adaptation as a progressive, subject-level transfer problem. It introduces P-MSDA, which ranks source subjects by their similarity to an unlabeled target and gradually integrates only the most relevant ones, while maintaining a density-based replay memory to prevent catastrophic forgetting. The method combines curriculum learning principles with self-paced adaptation and pseudo-labeling (ACPL), plus a domain-alignment objective whose maximum mean discrepancy (MMD) terms also cover the replay domain. Across BioVid, UNBC-McMaster, Aff-Wild2, BAH, and cross-dataset settings, P-MSDA consistently outperforms single-source and standard MSDA baselines, with gains that hold for both CNN and ViT backbones. These results indicate improved personalization in FER and pain estimation under realistic, diverse conditions, with practical implications for deploying personalized FER systems while controlling computational costs.

Abstract

Personalized facial expression recognition (FER) involves adapting a machine learning model using samples from labeled source domains and an unlabeled target domain. Because subtle expressions vary considerably between individuals, state-of-the-art unsupervised domain adaptation (UDA) methods focus on the multi-source UDA (MSDA) setting, where each domain corresponds to a specific subject, to improve model accuracy and robustness. However, when adapting to a specific target, the diversity of the source domains translates to a large shift between source and target data. State-of-the-art MSDA methods for FER address this domain shift by using all the sources to adapt to the target representations. Nevertheless, large distributional differences between some sources and the target often result in negative transfer. In addition, integrating all sources simultaneously increases computational cost and causes misalignment with the target. To address these issues, we propose a progressive MSDA approach that gradually introduces information from source subjects based on their similarity to the target subject. This ensures that only the sources most relevant to the target are selected, which helps avoid the negative transfer caused by dissimilar sources. We first exploit the closest sources to reduce the distribution shift with the target, and then move towards more distant ones, considering only the most relevant sources based on a predetermined threshold. Furthermore, to mitigate catastrophic forgetting caused by the incremental introduction of source subjects, we implement a density-based memory mechanism that preserves the most relevant historical source samples for adaptation. We conduct extensive experiments on BioVid, UNBC-McMaster, Aff-Wild2, BAH, and in a cross-dataset setting.
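
As a rough illustration of the progressive idea described in the abstract, the sketch below ranks source subjects closest-first, drops those beyond a threshold, and keeps a small density-based memory per selected subject. It is not the authors' implementation: the abstract does not specify the similarity measure or the density estimator, so the RBF-kernel MMD, the kernel-density score, the helper names (`rbf_mmd2`, `rank_sources`, `densest_samples`), and all numeric values are assumptions chosen for illustration, and the features are synthetic.

```python
import numpy as np

def rbf_mmd2(x, y, gamma=1.0):
    """Biased estimate of squared MMD between samples x and y (RBF kernel)."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

def rank_sources(sources, target, threshold, gamma=1.0):
    """Order source ids closest-first; keep those with MMD^2 <= threshold.

    Using MMD as the similarity measure is an assumption of this sketch.
    """
    scores = {sid: rbf_mmd2(f, target, gamma) for sid, f in sources.items()}
    ordered = sorted(scores, key=scores.get)
    return [sid for sid in ordered if scores[sid] <= threshold], scores

def densest_samples(feats, k, gamma=1.0):
    """Indices of the k samples with the highest kernel-density score."""
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    density = np.exp(-gamma * d2).mean(axis=1)
    return np.argsort(-density)[:k]

# Synthetic stand-ins for per-subject feature matrices (hypothetical data).
rng = np.random.default_rng(0)
target = rng.normal(0.0, 1.0, size=(200, 4))
sources = {
    "near": rng.normal(0.1, 1.0, size=(200, 4)),
    "mid": rng.normal(1.0, 1.0, size=(200, 4)),
    "far": rng.normal(5.0, 1.0, size=(200, 4)),
}

threshold = 0.5  # hypothetical cut-off on squared MMD
selected, scores = rank_sources(sources, target, threshold, gamma=0.1)

memory = {}
for sid in selected:
    # An adaptation step on sources[sid], the target, and the replay
    # memory collected so far would go here; it is omitted in this sketch.
    idx = densest_samples(sources[sid], k=20, gamma=0.1)
    memory[sid] = sources[sid][idx]
```

In this toy setup the most distant subject exceeds the threshold and is never introduced, while each selected subject contributes only a handful of high-density samples to the replay memory.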
