Two-Step Data Augmentation for Masked Face Detection and Recognition: Turning Fake Masks to Real
Yan Yang, George Bebis, Mircea Nicolescu
TL;DR
Masked-face datasets are limited, which hinders robust detection and recognition under occlusion. The authors propose a two-step augmentation: rule-based mask warping generates guided inputs, and an unpaired image-to-image translation model (AttentionGAN-like) then renders more realistic masks; a non-mask-change loss and stochastic noise stabilize training and boost diversity. The approach yields qualitative improvements over rule-based warping alone and complements IAMGAN's GAN-based generation, addressing both realism and coverage of mask variations. Limitations include potential overfitting on small datasets and limited mask-type diversity, which point to future work on larger, more varied data and refined losses.
Abstract
Data scarcity and distribution shift pose major challenges for masked face detection and recognition. We propose a two-step generative data augmentation framework that combines rule-based mask warping with unpaired image-to-image translation using GANs, enabling the generation of realistic masked-face samples beyond purely synthetic transformations. Compared to rule-based warping alone, the proposed approach yields consistent qualitative improvements and complements existing GAN-based masked face generation methods such as IAMGAN. We introduce a non-mask preservation loss and stochastic noise injection to stabilize training and enhance sample diversity. Experimental observations highlight the effectiveness of the proposed components and suggest directions for future improvements in data-centric augmentation for face recognition tasks.
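To make the idea of a non-mask preservation loss concrete, the sketch below penalizes pixel changes that fall outside the mask region, so the translation step is discouraged from altering the rest of the face. This is an illustrative assumption, not the formulation used in the paper: the function name `non_mask_preservation_loss`, the binary `mask_region` input, and the choice of a mean absolute (L1) difference are all hypothetical.

```python
import numpy as np

def non_mask_preservation_loss(source, generated, mask_region):
    """Mean absolute change over pixels OUTSIDE the mask region.

    Assumed form (not taken from the paper): an L1 penalty weighted by
    the complement of a binary mask map.

    source, generated: float arrays of shape (H, W, C).
    mask_region: array of shape (H, W); 1 where the face mask is, 0 elsewhere.
    """
    # 1 where the face should stay unchanged, 0 inside the mask region
    keep = 1.0 - mask_region.astype(np.float64)
    diff = np.abs(generated.astype(np.float64) - source.astype(np.float64))
    # Broadcast the (H, W) weight map over the channel axis
    weighted = diff * keep[..., None]
    n = keep.sum() * source.shape[-1]
    if n == 0:
        return 0.0  # the whole image is mask region; nothing to preserve
    return float(weighted.sum() / n)

# Toy example: a 4x4 RGB image whose lower half is the mask region
rng = np.random.default_rng(0)
source = rng.random((4, 4, 3))
mask_region = np.zeros((4, 4))
mask_region[2:, :] = 1

generated = source.copy()
generated[2:, :, :] = 0.5          # edit only inside the mask region
loss_inside_only = non_mask_preservation_loss(source, generated, mask_region)

generated[0, 0, :] += 0.3          # also disturb one non-mask pixel
loss_with_leak = non_mask_preservation_loss(source, generated, mask_region)
```

In this toy case, edits confined to the mask region leave the loss at zero, while the single disturbed non-mask pixel makes it positive. In a GAN training loop, a term of this kind would typically be added to the adversarial objective with a weighting coefficient, which is likewise an assumption here.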
