Personalized Feature Translation for Expression Recognition: An Efficient Source-Free Domain Adaptation Method

Masoumeh Sharafi, Soufiane Belharbi, Muhammad Osama Zeeshan, Houssem Ben Salem, Ali Etemad, Alessandro Lameiras Koerich, Marco Pedersoli, Simon Bacon, Eric Granger

TL;DR

Experiments show that SFDA-PFT consistently outperforms state-of-the-art SFDA methods in privacy-sensitive FER scenarios. Operating in the latent space avoids noisy facial image generation, reduces computation, and yields discriminative embeddings for classification.

Abstract

Facial expression recognition (FER) models are widely used in video-based affective computing applications, such as human-computer interaction and healthcare monitoring. However, deep FER models often struggle with subtle expressions and high inter-subject variability, limiting performance in real-world settings. Source-free domain adaptation (SFDA) has been proposed to personalize a pretrained source model using only unlabeled target data, avoiding privacy, storage, and transmission constraints. We address a particularly challenging setting where source data is unavailable and the target data contains only neutral expressions. Existing SFDA methods are not designed for adaptation from a single target class, while generating non-neutral facial images is often unstable and expensive. To address this, we propose Source-Free Domain Adaptation with Personalized Feature Translation (SFDA-PFT), a lightweight latent-space approach. A translator is first pretrained on source data to map subject-specific style features between subjects while preserving expression information through expression-consistency and style-aware objectives. It is then adapted to neutral target data without source data or image synthesis. By operating in the latent space, SFDA-PFT avoids noisy facial image generation, reduces computation, and learns discriminative embeddings for classification. Experiments on BioVid, StressID, BAH, and Aff-Wild2 show that SFDA-PFT consistently outperforms state-of-the-art SFDA methods in privacy-sensitive FER scenarios. Our code is publicly available at https://github.com/MasoumehSharafi/SFDA-PFT.
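To make the latent-space idea concrete, here is a minimal sketch of how a feature translator could sit between a backbone's features and a frozen source classifier. It is not the authors' implementation: the residual linear form of the translator, the function names (`translate`, `classify`, `predict`), and the toy dimensions are all assumptions chosen for illustration. Only the overall structure (translate features, then apply the fixed classifier) comes from the abstract.

```python
import random


def linear(weights, bias, x):
    """Compute y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def translate(feature, t_weights, t_bias):
    """Hypothetical translator T: a residual linear map f -> f + (W f + b).

    The residual form is an assumption; the abstract only says that T
    operates on features rather than on images.
    """
    delta = linear(t_weights, t_bias, feature)
    return [f + d for f, d in zip(feature, delta)]


def classify(feature, c_weights, c_bias):
    """Frozen source classifier C: index of the largest logit."""
    logits = linear(c_weights, c_bias, feature)
    return max(range(len(logits)), key=logits.__getitem__)


def predict(feature, translator, classifier):
    """Inference path: target feature -> T -> C. No image is synthesized."""
    return classify(translate(feature, *translator), *classifier)


# Toy setup (dimensions and values are illustrative only).
dim, num_classes = 4, 2
rng = random.Random(0)
classifier = (
    [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_classes)],
    [0.0] * num_classes,
)
# A zero-initialized residual translator leaves features unchanged.
identity_translator = ([[0.0] * dim for _ in range(dim)], [0.0] * dim)
target_feature = [rng.uniform(-1, 1) for _ in range(dim)]

pred = predict(target_feature, identity_translator, classifier)
```

In this sketch, everything the translator does happens on a short feature vector, which is the intuition behind the claimed savings over pixel-level translation. With the zero-initialized translator, the prediction matches the unadapted source classifier.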

Paper Structure

This paper contains 29 sections, 5 equations, 10 figures, 20 tables, 2 algorithms.

Figures (10)

  • Figure 1: A comparison of a standard image translation method, SFDA-IT (Hou et al., 2021), with our SFDA-PFT on BioVid data. (a) Image translation methods operate at the pixel level and require complex mappings to align the target and source styles. (b) Our SFDA-PFT method translates directly in the source feature space, allowing for efficient personalization. (right) Accuracy, parameter counts, and FLOPs at inference highlight the trade-offs between the two approaches, with models implemented using a ResNet-18 backbone.
  • Figure 2: Overview of the proposed SFDA-PFT method. (a) During pre-training, the translator $\mathbf{T}$ is trained to map Sub-i features into the distribution of Sub-j from the source dataset, using a combination of style alignment and expression consistency losses. (b) During adaptation, only the feature translator $\mathbf{T}$ is updated, using expression-consistent predictions from two different images (Image1 and Image2) of the same target subject. (c) At inference time, the trained translator $\mathbf{T}$ and the fixed source classifier $\mathbf{C}$ are used to predict expressions for target-domain inputs.
  • Figure 3: Source subject pairing on the BioVid dataset. (a) Examples of random, cosine-based, and landmark-based pairs. (b) Average ACC, with landmark-based pairing performing best.
  • Figure 4: Distribution of target samples for Sub-1 in the BioVid dataset across source subjects.
  • Figure 5: t-SNE of source vs. translated features for Sub-1 in BioVid, comparing feature-based (left) and image-based (right) translation.
  • ...and 5 more figures
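
Panel (b) of Figure 2 describes adaptation driven by expression-consistent predictions for two images of the same target subject. The exact loss is not given in this summary, so the following is only a sketch of one plausible consistency term: a symmetric KL divergence between the two predicted class distributions. The choice of symmetric KL and the names `softmax`, `kl`, and `consistency_loss` are assumptions, not the paper's formulation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions (eps avoids log(0))."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_a, logits_b):
    """Assumed consistency term: symmetric KL between the two predictions.

    logits_a and logits_b stand for the classifier outputs C(T(f1)) and
    C(T(f2)) for two neutral images of the same subject.
    """
    p, q = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl(p, q) + kl(q, p))


# Identical predictions incur no penalty; conflicting ones are penalized.
same = consistency_loss([2.0, 0.1], [2.0, 0.1])
diff = consistency_loss([2.0, 0.1], [0.1, 2.0])
```

In a full training loop, a term like this would be minimized with respect to the translator's parameters only, with the classifier kept fixed, as the caption states. A consistency term on its own can be satisfied by degenerate solutions, so a practical objective would likely combine it with other terms.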