Masked Face Recognition with Generative-to-Discriminative Representations
Shiming Ge, Weijia Guo, Chenyu Li, Junzheng Zhang, Yong Li, Dan Zeng
TL;DR
Masked face recognition is challenged by occlusions that degrade information needed for identity prediction. The authors propose a generative-to-discriminative framework (G2D) that cascades a generative encoder for mask-robust representations with a discriminative reformer to recover identity cues, followed by a lightweight classifier head. The backbone is learned via greedy, module-wise pretraining on synthetic masked faces using self-supervised losses and relational distillation from a pretrained teacher, yielding occlusion-robust and identity-discriminative representations. Across synthetic LFW and realistic RMFD/MLFW benchmarks, G2D achieves state-of-the-art performance while maintaining competitive normal-face recognition and practical inference speed, highlighting its potential for real-world safety and governance applications.
Abstract
Masked face recognition is important for social good but challenged by diverse occlusions that cause insufficient or inaccurate representations. In this work, we propose a unified deep network to learn generative-to-discriminative representations for facilitating masked face recognition. To this end, we split the network into three modules and learn them on synthetic masked faces in a greedy module-wise pretraining manner. First, we leverage a generative encoder pretrained for face inpainting and finetune it to represent masked faces into category-aware descriptors. Attribute to the generative encoder's ability in recovering context information, the resulting descriptors can provide occlusion-robust representations for masked faces, mitigating the effect of diverse masks. Then, we incorporate a multi-layer convolutional network as a discriminative reformer and learn it to convert the category-aware descriptors into identity-aware vectors, where the learning is effectively supervised by distilling relation knowledge from off-the-shelf face recognition model. In this way, the discriminative reformer together with the generative encoder serves as the pretrained backbone, providing general and discriminative representations towards masked faces. Finally, we cascade one fully-connected layer following by one softmax layer into a feature classifier and finetune it to identify the reformed identity-aware vectors. Extensive experiments on synthetic and realistic datasets demonstrate the effectiveness of our approach in recognizing masked faces.
