LRDif: Diffusion Models for Under-Display Camera Emotion Recognition
Zhifeng Wang, Kaihao Zhang, Ramesh Sankaranarayana
TL;DR
LRDif tackles facial expression recognition (FER) on images degraded by under-display cameras (UDC) by pairing a two-stage training framework with diffusion-based label restoration. The first stage builds a compact emotion prior representation (EPR) $Z$ via FPEN_S1 to guide UDCformer; the second stage uses a diffusion model to estimate $Z$ directly from degraded UDC images, enabling robust emotion prediction. The approach combines a Dynamic UDC transformer (UDCformer) with a Dynamic Image and Landmarks Network (DILnetwork) for multi-scale feature fusion, and optimizes a total loss $\mathcal{L}_{total} = \mathcal{L}_{ce} + \mathcal{L}_{kl}$ that adds a KL-based EPR regularization term to the cross-entropy classification loss. Empirically, LRDif achieves state-of-the-art or competitive results on standard FER datasets (RAF-DB, FERPlus, KDEF) and their UDC variants (UDC-RAF-DB, UDC-FERPlus, UDC-KDEF), which makes it practically relevant for FER on devices with UDC hardware.
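To make the loss composition concrete, here is a minimal single-sample sketch of $\mathcal{L}_{total} = \mathcal{L}_{ce} + \mathcal{L}_{kl}$ in plain Python. It is an illustration, not the authors' implementation. The names `z_stage1` and `z_diffusion` are hypothetical stand-ins for the stage-one EPR and the diffusion-estimated EPR. Normalizing both with a softmax before taking the KL divergence, and the direction of that divergence, are assumptions not stated in this summary.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross-entropy for one sample: -log softmax(logits)[label]."""
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def total_loss(logits, label, z_stage1, z_diffusion):
    """L_total = L_ce + L_kl.

    Assumption: both EPR vectors are softmax-normalised so that they can be
    compared as distributions, with the stage-one EPR as the reference.
    """
    l_ce = cross_entropy(logits, label)
    l_kl = kl_divergence(softmax(z_stage1), softmax(z_diffusion))
    return l_ce + l_kl
```

When the diffusion estimate matches the stage-one EPR exactly, the KL term vanishes and the total reduces to the cross-entropy alone, which is the behavior a regularizer of this kind is meant to reward.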
Abstract
This study introduces LRDif, a novel diffusion-based framework designed specifically for facial expression recognition (FER) within the context of under-display cameras (UDC). To address the inherent challenges posed by UDC's image degradation, such as reduced sharpness and increased noise, LRDif employs a two-stage training strategy that integrates a condensed preliminary extraction network (FPEN) and an agile transformer network (UDCformer) to identify emotion labels from UDC images. By harnessing the robust distribution mapping capabilities of Diffusion Models (DMs) and the spatial dependency modeling strength of transformers, LRDif overcomes the obstacles of noise and distortion inherent in UDC environments. Comprehensive experiments on standard FER datasets, including RAF-DB, KDEF, and FERPlus, show that LRDif achieves state-of-the-art performance, underscoring its potential in advancing FER applications. This work not only addresses a significant gap in the literature by tackling the UDC challenge in FER but also sets a new benchmark for future research in the field.
