Trend-Aware Supervision: On Learning Invariance for Semi-Supervised Facial Action Unit Intensity Estimation
Yingjie Chen, Jiarui Zhang, Tao Wang, Yun Liang
TL;DR
This work tackles spurious correlations in semi-supervised facial AU intensity estimation from keyframes by introducing Trend-Aware Supervision (TAS), which leverages trend information to learn invariant AU-specific features. TAS comprises intra-trend ranking, intra-trend speed, and inter-trend subject awareness, applied as four losses alongside the standard regression objective. Across BP4D and DISFA, TAS achieves state-of-the-art ICC and MAE among semi-supervised methods and remains competitive with fully supervised approaches, without increasing inference cost. The approach provides a principled way to disentangle AU-specific appearance changes from co-occurrence and subject biases, with practical value for robust facial behavior analysis under limited annotations.
Abstract
With the increasing need for facial behavior analysis, semi-supervised AU intensity estimation using only keyframe annotations has emerged as a practical and effective way to relieve the annotation burden. However, the scarcity of annotations makes the spurious correlation problem caused by AU co-occurrences and subject variation much more prominent, leading to non-robust intensity estimates that are entangled among AUs and biased among subjects. We observe that the trend information inherent in keyframe annotations can act as extra supervision, and that raising awareness of AU-specific facial appearance changing trends during training is the key to learning invariant AU-specific features. To this end, we propose Trend-Aware Supervision (TAS), which pursues three kinds of trend awareness: intra-trend ranking awareness, intra-trend speed awareness, and inter-trend subject awareness. By raising trend awareness during training, TAS alleviates the spurious correlation problem and learns AU-specific features that represent the corresponding facial appearance changes, yielding invariant intensity estimation. Experiments on two commonly used AU benchmark datasets, BP4D and DISFA, show the effectiveness of each kind of awareness. Moreover, trend-aware supervision improves performance without extra computational or storage costs during inference.
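To make the idea of intra-trend ranking awareness more concrete, the following is a minimal illustrative sketch, not the formulation used in this work. It assumes that the frames between two annotated keyframes form one trend segment whose intensity changes monotonically, and it penalizes adjacent-frame predictions that move against that direction. The function name `intra_trend_ranking_loss`, the `rising` flag, and the `margin` parameter are all hypothetical and introduced only for illustration.

```python
def intra_trend_ranking_loss(preds, rising, margin=0.0):
    """Hinge penalty on adjacent-frame predictions that violate a trend.

    preds  : predicted intensities for consecutive frames lying between
             two annotated keyframes (one assumed trend segment)
    rising : True if the keyframe labels indicate increasing intensity,
             False if they indicate decreasing intensity
    margin : hypothetical slack; 0.0 only requires the correct ordering
    """
    if len(preds) < 2:
        # A single frame has no ordering to violate.
        return 0.0
    sign = 1.0 if rising else -1.0
    total = 0.0
    for earlier, later in zip(preds, preds[1:]):
        # Positive only when the step from `earlier` to `later`
        # goes against the assumed trend direction (or falls short
        # of the margin).
        total += max(0.0, margin - sign * (later - earlier))
    # Average over the number of adjacent pairs.
    return total / (len(preds) - 1)


if __name__ == "__main__":
    # Predictions that respect a rising trend incur no penalty.
    print(intra_trend_ranking_loss([0.0, 1.0, 2.0], rising=True))
    # A dip in the middle of a rising trend is penalized.
    print(intra_trend_ranking_loss([1.0, 0.5, 2.0], rising=True))
```

Because a term of this kind would only be evaluated during training, it is consistent with the claim that trend-aware supervision adds no computational or storage cost at inference time.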
