Affective Behavior Analysis using Task-adaptive and AU-assisted Graph Network

Xiaodong Li, Wenchao Du, Hongyu Yang

TL;DR

The paper tackles multi-task affective behavior analysis in-the-wild, covering action unit (AU) detection, facial expression (EXPR) recognition, and valence-arousal (VA) estimation. It combines a large pretrained feature extractor (Dinov2), a task-adaptive cross-attention block that produces task-specific representations, and an AU-assisted Graph Convolutional Network that exploits AU correlations to help the EXPR and VA tasks. A joint loss guides simultaneous learning across tasks; it combines binary cross-entropy (BCE) for AU, weighted cross-entropy (CE) for EXPR, and a term based on the concordance correlation coefficient (CCC) for VA. On the s-Aff-Wild2 dataset, the approach achieves a best validation score of 1.2542, holds up under 6-fold cross-validation, and outperforms a VGG-Face baseline. The combination of robust pretrained features, task-adaptive cross-attention, and AU-relational modeling offers a practical path to improved multi-task learning (MTL) performance in the ABAW7 setting.
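To make the CCC-based VA term more concrete, here is a minimal sketch of the concordance correlation coefficient in plain Python. The function names `ccc` and `va_loss` are hypothetical, and the form `(1 - CCC_valence) + (1 - CCC_arousal)` is a common convention assumed here; this summary does not state the exact formulation or weighting the paper uses.

```python
def ccc(pred, target):
    """Concordance correlation coefficient of two equal-length sequences.

    CCC = 2 * cov(pred, target) / (var(pred) + var(target) + (mean_p - mean_t)^2)
    """
    n = len(pred)
    if n != len(target) or n < 2:
        raise ValueError("pred and target must have the same length (>= 2)")

    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    # Population (biased) variance and covariance.
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_t = sum((t - mean_t) ** 2 for t in target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target)) / n

    denom = var_p + var_t + (mean_p - mean_t) ** 2
    if denom == 0.0:
        # Both sequences are constant and equal; CCC is undefined here.
        # Returning 0.0 is a choice made for this sketch only.
        return 0.0
    return 2.0 * cov / denom


def va_loss(pred_v, true_v, pred_a, true_a):
    """Assumed VA loss: sum of (1 - CCC) over valence and arousal."""
    return (1.0 - ccc(pred_v, true_v)) + (1.0 - ccc(pred_a, true_a))
```

Unlike Pearson correlation, CCC penalizes a constant offset between predictions and labels: predictions that track the labels perfectly but are shifted by a fixed amount score below 1.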

Abstract

In this paper, we present our solution and experimental results for the Multi-Task Learning Challenge of the 7th Affective Behavior Analysis in-the-wild (ABAW7) Competition. This challenge consists of three tasks: action unit detection, facial expression recognition, and valence-arousal estimation. We address the research problems of this challenge from three aspects: 1) To learn robust visual feature representations, we introduce the pre-trained large model Dinov2. 2) To adaptively extract the features required by each task, we design a task-adaptive block that performs cross-attention between a set of learnable query vectors and pre-extracted features. 3) By proposing the AU-assisted Graph Convolutional Network (AU-GCN), we make full use of the correlation information between AUs to assist in solving the EXPR and VA tasks. Finally, we achieve an evaluation score of 1.2542 on the validation set provided by the organizers.
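The cross-attention step between learnable query vectors and pre-extracted features can be sketched as follows. This is a simplified, single-head illustration in plain Python under stated assumptions: it omits the learned key/value/output projections, normalization, and any multi-head structure that a real block would likely include, and the names `softmax` and `cross_attention` are hypothetical. The paper's actual block design is not specified in this summary.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, features):
    """Scaled dot-product cross-attention without learned projections.

    queries:  list of m query vectors (each of length d); in training these
              would be learnable parameters, one set per task.
    features: list of n pre-extracted feature vectors (each of length d).
    Returns (outputs, weights): m output vectors of length d, and the
    m x n attention weights.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every feature vector.
        scores = [scale * sum(qi * fi for qi, fi in zip(q, f)) for f in features]
        w = softmax(scores)
        # Each output is a weighted average of the feature vectors.
        out = [sum(w[j] * features[j][k] for j in range(len(features)))
               for k in range(d)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights
```

In a multi-task setting, one would keep a separate set of queries for each of the AU, EXPR, and VA tasks, so that each task attends to a different mixture of the same shared features.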

Paper Structure (11 sections, 6 equations, 1 figure, 2 tables)

This paper contains 11 sections, 6 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 1: The pipeline of our method for the MTL challenge.