DIG-FACE: De-biased Learning for Generalized Facial Expression Category Discovery

Tingzhang Luo; Yichao Liu; Yuanyuan Liu; Andi Zhang; Xin Wang; Yibing Zhan; Chang Tang; Leyuan Liu; Zhe Chen

TL;DR

This work proposes DIG-FACE, a debiased G-FACE method that mitigates both implicit and explicit biases, and devises a novel learning strategy that estimates and minimizes the upper bound of the implicit bias.

Abstract

We introduce a novel task, Generalized Facial Expression Category Discovery (G-FACE), which aims to discover new, unseen facial expressions while still recognizing known categories effectively. Although generalized category discovery methods exist for natural images, their performance degrades on G-FACE. We identify two biases that affect learning: implicit bias, which arises from an underlying distributional gap between new categories in unlabeled data and known categories in labeled data, and explicit bias, which arises from a shifted preference for explicit visual facial-change characteristics when moving from known to unknown expressions. To address the challenges caused by both biases, we propose a Debiased G-FACE method, namely DIG-FACE, that debiases both implicit and explicit biases. In the implicit debiasing process of DIG-FACE, we devise a novel learning strategy that estimates and minimizes the upper bound of the implicit bias. In the explicit debiasing process, we improve the model's ability to handle nuanced visual facial expression data by introducing a hierarchical category-discrimination refinement strategy consisting of sample-level, triplet-level, and distribution-level optimizations. Extensive experiments demonstrate that DIG-FACE significantly enhances recognition accuracy for both known and new categories, setting a first-of-its-kind standard for the task.

Paper Structure (30 sections, 1 theorem, 34 equations, 10 figures, 10 tables, 2 algorithms)

Introduction
Related Work
Problem Formulation
Problem Setting
Implicit Bias
Explicit Bias
DIG-FACE Debiasing Methodology
Implicit Debiasing
Explicitly Debiasing
Overall Loss Function:
Experiments
Experimental Setup
Comparison with the State-of-the-Art methods
Ablation Studies
Analysis
...and 15 more sections

Key Result

Lemma 1

As the labels of $\mathcal{D}_U$ are not directly observable, we leverage the labeled data $\mathcal{D}_L$ to establish an upper bound for the implicit bias caused by new categories. Specifically, the implicit bias satisfies: where $\lambda = \alpha\cdot\xi_{\mathcal{D}_L}(\mathcal{H^*},\mathcal{F})+\xi_{\mathcal{D}_U}(\mathcal{H^*},\mathcal{F})$. The proof is provided in the appendix.

Figures (10)

Figure 1: G-FACE aims to discover unknown (new) facial expressions while recognizing known (old) classes, both of which are present in the unlabeled data.
Figure 2: Top: Implicit bias leads to a decline in recognition accuracy for known categories in later training stages, as seen in previous GCD models such as the SimGCD baseline. Bottom: Explicit bias can be analyzed by observing the overlap between known and unknown classes, which may lead to a blurred decision boundary. We present a t-SNE visualization of the SimGCD baseline.
Figure 3: Overview of the DIG-FACE framework. (i) In the implicit debiasing stage, we estimate and minimize the maximum bias by enforcing consistency between the main and auxiliary heads on $\mathcal{D}_L$ and inconsistency on $\mathcal{D}_U$. At this stage, $P_l$ and $P_u$ denote the pseudo-labels of labeled and unlabeled data. (ii) In the explicit debiasing stage, we enhance recognition of known and unknown categories through sample-level, triplet-level, and distribution-level optimization, and improve decision-making under G-FACE with parametric learning.
Figure 4: Attention visualization of different self-attention heads (numbered h1 to h3) on RAF-DB. The top 10% attended patches are shown in red. Our method pays more attention to details of the cheeks, eyes, and mouth corners.
Figure 5: Compared to the baseline SimGCD, the features extracted by our method are more separable in t-SNE visualization.
...and 5 more figures
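
The consistency/inconsistency idea in the Figure 3 caption can be illustrated with a small sketch. This is not the paper's loss: the L1 distance between softmax outputs, and the names `head_discrepancy` and `bias_gap`, are assumptions made here for illustration only. The sketch measures how much a main head and an auxiliary head disagree on a batch, then takes the disagreement on unlabeled data minus the disagreement on labeled data. Under this reading, an auxiliary head trained to enlarge the gap would estimate a worst-case bias, and the main model would be trained to shrink it.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def head_discrepancy(main_logits, aux_logits):
    # Mean L1 distance between the two heads' class probabilities.
    # The choice of L1 is an assumption, not taken from the paper.
    total = 0.0
    for zm, za in zip(main_logits, aux_logits):
        pm, pa = softmax(zm), softmax(za)
        total += sum(abs(a - b) for a, b in zip(pm, pa))
    return total / len(main_logits)

def bias_gap(main_l, aux_l, main_u, aux_u):
    # Disagreement on unlabeled data minus disagreement on labeled data.
    # Hypothetical surrogate: large when the heads agree on D_L
    # but disagree on D_U.
    return head_discrepancy(main_u, aux_u) - head_discrepancy(main_l, aux_l)

# Toy batches: heads agree on labeled samples, disagree on unlabeled ones.
labeled_main = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
labeled_aux = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
unlabeled_main = [[3.0, 0.0, 0.0]]
unlabeled_aux = [[0.0, 0.0, 3.0]]
print(bias_gap(labeled_main, labeled_aux, unlabeled_main, unlabeled_aux))
```

In an actual training loop the logits would come from differentiable network heads, and the gap would be optimized adversarially rather than simply printed.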

Theorems & Definitions (6)

Definition 1: Discrepancy Metric
Definition 2: Category-based Discrepancy Composition
Definition 3: F-discrepancy
Lemma 1: Upper Bound on Implicit Bias
Proof 1
Proof 2
