Subject-Based Domain Adaptation for Facial Expression Recognition

Muhammad Osama Zeeshan; Muhammad Haseeb Aslam; Soufiane Belharbi; Alessandro Lameiras Koerich; Marco Pedersoli; Simon Bacon; Eric Granger

TL;DR

The paper tackles personalization in facial expression recognition (FER) by treating each subject as a distinct domain and applying multi-source domain adaptation (MSDA) to transfer knowledge from multiple labeled source subjects to a single unlabeled target subject. It introduces a two-stage framework: (1) align the source subjects using a supervised loss together with a between-subject discrepancy loss based on maximum mean discrepancy (MMD); (2) generate reliable target pseudo-labels with the Augmented Confident Pseudo Labels (ACPL) strategy, which combines augmentation with confidence thresholding, and then jointly adapt source and target using the L_ce and L_mmd loss terms. Empirical results on BioVid and UNBC-McMaster show that subject-based MSDA with ACPL outperforms standard UDA and existing MSDA baselines, especially when the most relevant source subjects are selected, and that it scales well to a large number of sources. These findings point to a practical path toward scalable, subject-aware FER systems that can be personalized to individual users.
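To make the between-subject discrepancy term concrete, the following is a minimal sketch of a squared-MMD estimate averaged over pairs of source subjects. It is an illustration under stated assumptions, not the authors' implementation: the RBF kernel, the `gamma` bandwidth, the biased estimator, the pairwise averaging, and all function names are hypothetical choices that the summary above does not specify.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel between two feature vectors (kernel choice is an assumption)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between two sets of feature vectors."""
    def mean_kernel(a, b):
        total = sum(rbf_kernel(u, v, gamma) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def between_source_discrepancy(sources, gamma=1.0):
    """Average squared MMD over all unordered pairs of source subjects.

    `sources` is a list with one entry per subject; each entry is a list of
    feature vectors. Averaging over pairs is an assumption of this sketch.
    """
    pairs = [(i, j) for i in range(len(sources)) for j in range(i + 1, len(sources))]
    if not pairs:
        return 0.0
    return sum(mmd_squared(sources[i], sources[j], gamma) for i, j in pairs) / len(pairs)


# Toy features for three hypothetical source subjects.
subj_a = [[0.0, 0.0], [0.1, 0.0]]
subj_b = [[3.0, 3.0], [3.1, 3.0]]
subj_c = [[0.0, 0.1], [0.1, 0.1]]
loss = between_source_discrepancy([subj_a, subj_b, subj_c])
```

In a training loop, a term like this would be computed on feature embeddings of each subject's mini-batch and added to the supervised loss, so that minimizing it pulls the subjects' feature distributions together.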

Abstract

Adapting a deep learning model to a specific target individual is a challenging facial expression recognition (FER) task that may be achieved using unsupervised domain adaptation (UDA) methods. Although several UDA methods have been proposed to adapt deep FER models across source and target datasets, multiple subject-specific source domains are needed to accurately represent the intra- and inter-person variability in subject-based adaptation. This paper considers the setting where domains correspond to individuals, not entire datasets. Unlike UDA, multi-source domain adaptation (MSDA) methods can leverage multiple source datasets to improve the accuracy and robustness of the target model. However, previous MSDA methods adapt image classification models across datasets and do not scale well to a larger number of source domains. This paper introduces a new MSDA method for subject-based domain adaptation in FER. It efficiently leverages information from multiple source subjects (labeled source domain data) to adapt a deep FER model to a single target individual (unlabeled target domain data). During adaptation, our subject-based MSDA first computes a between-source discrepancy loss to mitigate the domain shift among data from several source subjects. Then, a new strategy is employed to generate augmented confident pseudo-labels for the target subject, allowing a reduction in the domain shift between source and target subjects. Experiments performed on the challenging BioVid heat and pain dataset with 87 subjects and the UNBC-McMaster shoulder pain dataset with 25 subjects show that our subject-based MSDA can outperform state-of-the-art methods while scaling well to multiple subject-based source domains.
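The pseudo-labeling step can be illustrated with a small sketch. The abstract does not give the exact selection rule, so the version below is an assumed reading: a target sample is kept only if the model predicts the same class on the original image and on an augmented view, and both predictions reach a confidence threshold. The function names, the `threshold` value, and the toy probabilities are all hypothetical.

```python
def argmax_conf(probs):
    """Return (predicted class index, its probability) for one sample."""
    label = max(range(len(probs)), key=lambda k: probs[k])
    return label, probs[label]


def select_confident_pseudo_labels(probs_orig, probs_aug, threshold=0.9):
    """Pick target samples whose predictions are confident and augmentation-stable.

    probs_orig / probs_aug: per-sample class probabilities for the original
    and the augmented view. The agreement-plus-threshold rule is an assumption
    of this sketch, not the paper's stated criterion.
    """
    selected = []
    for idx, (p_o, p_a) in enumerate(zip(probs_orig, probs_aug)):
        label_o, conf_o = argmax_conf(p_o)
        label_a, conf_a = argmax_conf(p_a)
        if label_o == label_a and min(conf_o, conf_a) >= threshold:
            selected.append((idx, label_o))
    return selected


# Toy two-class example (class 0 = No-Pain, class 1 = Pain).
probs_orig = [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92]]
probs_aug = [[0.93, 0.07], [0.30, 0.70], [0.85, 0.15]]
pseudo = select_confident_pseudo_labels(probs_orig, probs_aug, threshold=0.9)
# pseudo == [(0, 0)]: only sample 0 is both confident and consistent.
```

Only the selected samples would then be used as labeled target data in the joint source-target adaptation step; the rest remain unlabeled.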

Paper Structure (14 sections, 10 equations, 4 figures, 2 tables)

Introduction
Related Work
Unsupervised Domain Adaptation
Multi-Source Domain Adaptation
Proposed Methodology
Notation
Alignment of Source Domains
Target Adaptation with Augmented Confident Target Pseudo-Labels
Results and Discussion
Experimental Methodology
Comparison with the State-of-the-Art
Ablation Studies
Visualization of Feature Distributions
Conclusions

Figures (4)

Figure 1: The settings for domain adaptation of a deep FER model. The left side shows the standard setting, with two approaches: (a) the single-source UDA approach, where a model trained on one labeled dataset is adapted to a single unlabeled dataset, and (b) the MSDA approach, which aligns multiple source datasets and adapts to a single target domain. The right side shows the subject-specific setting, also with two approaches: (c) the mix-subject UDA approach, which treats data from different subject identities (IDs) as a single labeled source and aligns it with the unlabeled target subjects, and (d) subject-based multi-source domain adaptation, which treats each subject as a separate domain, mitigates the domain shift among the sources, and aligns data from selected sources with the target subject. Blue indicates labeled source data, green indicates unlabeled target data, and grey indicates data from both domains.
Figure 2: An illustration of our proposed subject-based MSDA method. In the first step, we align labeled source subjects using discrepancy and supervision losses. In the second step, the augmented confident pseudo-label (ACPL) strategy is applied to generate reliable target pseudo-labels (PLs), and the adaptation model is then trained using the source subjects and the reliable target samples.
Figure 3: Performance comparison of subject-based MSDA, SImpAI, CMSDA, and M3SDA when 10, 30, 50, 60, and 77 source subjects are adapted to 10 unlabeled target subjects.
Figure 4: (a) Comparison between techniques for generating target PLs. The cyan bar indicates the standard way of generating PLs, while the orange bar indicates the EHTS approach. Our ACPL strategy combining different augmentations (horizontal-vertical flip, increased sharpness, and 90-degree rotation) is shown by the green bar. Finally, the ACPL technique with only horizontal-flip augmentation is indicated by the blue line. (b) A t-SNE projection of embeddings from source subjects to target subject SUB-1. Panel a) shows the source-only (baseline) setting without adaptation, panel b) shows the representation produced by the SImpAI method, and panel c) shows the feature embeddings produced by our approach for the target subject. Color and shape indicate the class: blue for No-Pain and green for Pain. (Best viewed in color.)
