Look Through Masks: Towards Masked Face Recognition with De-Occlusion Distillation

Chenyu Li, Shiming Ge, Daichi Zhang, Jia Li

TL;DR

This work tackles masked face recognition by integrating amodal perception-inspired de-occlusion with relational knowledge distillation. A GAN-based de-occlusion module performs explicit face completion to recover appearance under masks, while a teacher-student distillation transfers structural relations from a pre-trained unmasked-face recognizer to a student operating on completed faces. The distillation uses instance-, pair-, and triplet-wise relational losses, with an identity-centered, softened instance-wise variant to improve stability when domain shift occurs. Evaluations on synthetic (CelebA, LFW) and realistic (AR) masked-face datasets show consistent accuracy gains over baselines, demonstrating the practical value of combining amodal completion with structured knowledge transfer for robust masked-face recognition.

Abstract

Many real-world applications, such as video surveillance and urban governance, need to recognize masked faces. Diverse masks replace facial content, which leaves an incomplete appearance and an ambiguous representation and causes a sharp drop in accuracy. Inspired by recent progress on amodal perception, we propose to migrate the mechanism of amodal completion to masked face recognition with an end-to-end de-occlusion distillation framework, which consists of two modules. The de-occlusion module applies a generative adversarial network to perform face completion, which recovers the content under the mask and eliminates appearance ambiguity. The distillation module takes a pre-trained general face recognition model as the teacher and transfers its knowledge to train a student for completed faces using massive online synthesized face pairs. In particular, the teacher knowledge is represented as structural relations among instances in multiple orders, which serve as a posterior regularization to enable the adaptation. In this way, the knowledge can be fully distilled and transferred to identify masked faces. Experiments on synthetic and realistic datasets show the efficacy of the proposed approach.
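
The idea of matching "structural relations among instances in multiple orders" can be illustrated with a small sketch. This section does not give the paper's loss definitions, so the code below assumes the generic relational knowledge distillation formulation: pair-wise relations are mean-normalized Euclidean distances, triplet-wise relations are angle cosines, and both are compared with a Huber penalty. All function names and the toy embeddings are hypothetical, and the instance-wise term is omitted.

```python
import math
from itertools import combinations, permutations

# Assumption: generic relational distillation, not the paper's exact losses.

def _dist(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def _huber(x, y, delta=1.0):
    """Huber penalty between two scalar relation values."""
    d = abs(x - y)
    return 0.5 * d * d if d <= delta else delta * (d - 0.5 * delta)

def pair_relations(embs):
    """Second-order relations: pairwise distances divided by their mean."""
    d = [_dist(a, b) for a, b in combinations(embs, 2)]
    mean = sum(d) / len(d)
    return [x / mean for x in d] if mean > 0 else d

def _cos_angle(a, b, c):
    """Cosine of the angle at vertex b formed by points a, b, c."""
    u = [x - y for x, y in zip(a, b)]
    v = [x - y for x, y in zip(c, b)]
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)

def triplet_relations(embs):
    """Third-order relations: angle cosines over all ordered triplets."""
    return [_cos_angle(a, b, c) for a, b, c in permutations(embs, 3)]

def relational_loss(teacher, student, relation):
    """Mean Huber gap between teacher and student relation values."""
    t, s = relation(teacher), relation(student)
    return sum(_huber(x, y) for x, y in zip(t, s)) / len(t)

# Toy batch: teacher embeddings (unmasked faces) and two candidate
# student batches (completed faces). Values are illustrative only.
teacher = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
student_scaled = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]     # same structure
student_distorted = [[0.0, 0.0], [1.0, 0.0], [0.0, 5.0]]  # different structure

pair_ok = relational_loss(teacher, student_scaled, pair_relations)
pair_bad = relational_loss(teacher, student_distorted, pair_relations)
trip_ok = relational_loss(teacher, student_scaled, triplet_relations)
trip_bad = relational_loss(teacher, student_distorted, triplet_relations)
```

Under these assumptions, a student whose embeddings are a uniformly scaled copy of the teacher's incurs essentially zero loss, because only the relative geometry is compared. A student that distorts the geometry is penalized by both terms.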

Paper Structure (16 sections, 18 equations, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Inspired by the mechanism of amodal perception, we propose to solve masked face recognition via de-occlusion distillation, which first enforces face completion and then inherits rich knowledge from a pre-trained recognizer via distillation. In this way, both incomplete visual contents and inaccurate identity cues can be well recovered.
  • Figure 2: Overview of our proposed framework. The learning process consists of two stages. In the first stage, we initialize the input of the student model via an inpainting model. In the second stage, we perform cross-quality knowledge distillation, transferring the knowledge in the teacher recognizer for normal faces to the student recognizer by enforcing relational structure consistency. In this manner, the student network for recognizing masked faces learns representations for completed faces with the same clustering behaviors as the original ones, which could greatly benefit recognition accuracy.
  • Figure 3: Examples of the masks adopted for synthetic masked faces.
  • Figure 4: Evaluation accuracy of different models on LFW.
  • Figure 5: Loss changes with different loss settings.