OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations

Caixin Kang; Yubo Chen; Shouwei Ruan; Shiji Zhao; Ruochen Zhang; Jiayi Wang; Shan Fu; Xingxing Wei

TL;DR

OODFace is a robustness benchmark for face recognition (FR) under real-world distribution shifts. It defines 30 out-of-distribution (OOD) scenarios (20 common corruptions and 10 appearance variations), each at five severity levels, and uses them to construct three benchmarks: LFW-C/V, CFP-FP-C/V, and YTF-C/V. The authors evaluate 19 open-source FR models and 3 commercial APIs, supplement these results with physical face-mask tests, and explore two possible remedies: defense strategies and Vision-Language Models (VLMs). Robustness is quantified with Acc_clean, Acc_cor, and Acc_var, together with Relative Corruption Error ($\mathrm{RCE}$) and Relative Variations Error ($\mathrm{RVE}$). The results show that corruption robustness is not aligned with clean performance and that the Data & Processing category is particularly damaging. Defense strategies offer limited improvements, while some VLMs show strong potential for robust FR, which highlights both the promise of multimodal models and the practical challenges they raise for deployment and privacy. OODFace also provides a unified toolkit that can be extended to other datasets, and its findings point to the need for adaptable defenses and principled integration of multimodal models.

Abstract

With the rise of deep learning, facial recognition technology has seen extensive research and rapid development. Although facial recognition is considered a mature technology, we find that existing open-source models and commercial algorithms lack robustness in certain complex Out-of-Distribution (OOD) scenarios, raising concerns about the reliability of these systems. In this paper, we introduce OODFace, which explores the OOD challenges faced by facial recognition models from two perspectives: common corruptions and appearance variations. We systematically design 30 OOD scenarios across 9 major categories tailored for facial recognition. By simulating these challenges on public datasets, we establish three robustness benchmarks: LFW-C/V, CFP-FP-C/V, and YTF-C/V. We then conduct extensive experiments on 19 facial recognition models and 3 commercial APIs, along with extended physical experiments on face masks to assess their robustness. Next, we explore potential solutions from two perspectives: defense strategies and Vision-Language Models (VLMs). Based on the results, we draw several key insights, highlighting the vulnerability of facial recognition systems to OOD data and suggesting possible solutions. Additionally, we offer a unified toolkit that includes all corruption and variation types, easily extendable to other datasets. We hope that our benchmarks and findings can provide guidance for future improvements in facial recognition model robustness.
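To make the idea of a corruption applied at five severity levels concrete, the following is a minimal sketch, not the OODFace toolkit itself. It assumes Gaussian noise as the corruption type and uses illustrative noise strengths; the names `SEVERITY_SIGMA`, `gaussian_noise`, and `corrupt_at_all_severities` are hypothetical and do not come from the paper.

```python
import random

# Hypothetical noise standard deviations for severity levels 1-5.
# These values are illustrative assumptions, not taken from the paper.
SEVERITY_SIGMA = (0.04, 0.08, 0.12, 0.18, 0.26)


def gaussian_noise(image, severity, rng):
    """Return a noisy copy of a grayscale image given as rows of floats in [0, 1]."""
    if severity not in range(1, len(SEVERITY_SIGMA) + 1):
        raise ValueError("severity must be an integer from 1 to 5")
    sigma = SEVERITY_SIGMA[severity - 1]
    # Add zero-mean Gaussian noise per pixel, then clip back into [0, 1].
    return [
        [min(1.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
        for row in image
    ]


def corrupt_at_all_severities(image, seed=0):
    """Map each severity level (1-5) to a corrupted copy of the image."""
    rng = random.Random(seed)  # fixed seed keeps the corrupted set reproducible
    return {level: gaussian_noise(image, level, rng) for level in range(1, 6)}
```

Calling `corrupt_at_all_severities(image)` on one clean face image yields five progressively noisier copies. Repeating this for every corruption or variation type and every image in a dataset is one plausible way a "-C/V" style benchmark could be generated from a clean dataset.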
