RepFace: Refining Closed-Set Noise with Progressive Label Correction for Face Recognition

Jie Zhang, Xun Gong, Zhonglin Sun

TL;DR

RepFace tackles the challenge of closed-set label noise in face recognition by stabilizing early training with Auxiliary Sample Cleaning, then dynamically partitioning data into clean, ambiguous, and noisy groups for tailored supervision. It introduces Label Robust Fusion to leverage accumulated model predictions for ambiguous samples and Smoothing Label Correction to rectify closed-set noisy labels, all guided by cosine-center similarities. The approach achieves state-of-the-art results on synthetic noise benchmarks and demonstrates strong generalization on standard FR datasets, including IJB-B and IJB-C, with robustness to both synthetic and real-world noise. The proposed framework is compatible with existing hard-sample mining losses and shows reliable convergence, with the code to be released upon acceptance.

Abstract

Face recognition has made remarkable strides, driven by the expanding scale of datasets and by advances in backbones and discriminative losses. However, face recognition performance is heavily affected by label noise, especially closed-set noise. While numerous studies have focused on handling label noise, addressing closed-set noise remains challenging. This paper attributes the challenge to two issues: training is not robust to noise in its early stages, and samples with low confidence, which are often misclassified as closed-set noise in later training phases, need an appropriate learning strategy. To address these issues, we propose a new framework that stabilizes training at the early stages and splits the samples into clean, ambiguous, and noisy groups, each with its own training strategy. First, we employ generated auxiliary closed-set noisy samples to enable the model to identify noisy data early in training. Next, we split samples into clean, ambiguous, and noisy groups according to their similarity to the positive center and to the nearest negative center. We then perform label fusion for ambiguous samples by incorporating accumulated model predictions. Finally, we apply label smoothing within the closed set, adjusting the label to a point between the nearest negative class and the initially assigned label. Extensive experiments on mainstream face datasets validate the effectiveness of our method, which achieves state-of-the-art results. The code will be released upon acceptance.
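The final step of the abstract, moving a label to a point between the initially assigned class and the nearest negative class, can be illustrated with a small sketch. This is not the authors' implementation: the function name `smooth_corrected_label` and the mixing weight `alpha` are hypothetical, and the abstract does not say how the interpolation point is chosen.

```python
import numpy as np

def smooth_corrected_label(label, nearest_neg, num_classes, alpha=0.5):
    """Build a soft target between the assigned class and the nearest negative class.

    `alpha` is an assumed mixing weight: 0 keeps the original label,
    1 moves the label fully to the nearest negative class.
    """
    if label == nearest_neg:
        raise ValueError("nearest negative class must differ from the assigned label")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    target = np.zeros(num_classes, dtype=np.float64)
    target[label] = 1.0 - alpha      # mass kept on the initially assigned label
    target[nearest_neg] = alpha      # mass moved to the nearest negative class
    return target

# Example: a sample labelled 0 whose nearest negative center belongs to class 2.
soft = smooth_corrected_label(label=0, nearest_neg=2, num_classes=4, alpha=0.7)
```

A soft target of this form can be used with a cross-entropy loss in place of the one-hot label, so a suspected closed-set noisy sample is pulled toward the likely true class without fully discarding its original annotation.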

Paper Structure

This paper contains 30 sections, 11 equations, 5 figures, 10 tables, 1 algorithm.

Figures (5)

  • Figure 1: Noise category. a) The label noise present in the face recognition training set, which may include both closed-set (Flip) and open-set (Outlier) label noise, and b) the target dataset to be obtained through noise filtering and label correction.
  • Figure 2: The overall framework. (a) Auxiliary Sample Cleaning: randomly selected samples are assigned random labels. We then compute the normalized cosine similarity between each sample and the class centers and store it in the cosine matrix. This similarity is compared against the dynamic threshold $\eta$, which is the average cosine similarity of the auxiliary samples to their assigned random labels. (b) Sample splitting: we dynamically split the samples into clean, ambiguous, and noisy groups by comparing each sample's similarity to the positive class center with its similarity to the nearest negative center. (c) Label Fusion: we adopt a memory bank that records the accumulated model predictions, which serve as labels to stabilize training on ambiguous samples. (d) Smooth Label Correction: we smooth the label between the positive and the nearest negative class, as well as the cosine similarity logits.
  • Figure 3: Test results on LFW, AgeDB, CFP-FP, CALFW, and SLLFW for different values of the hyperparameter $\tau$.
  • Figure 4: Loss and LFW [huang2008labeled] accuracy of ArcFace [deng2019arcface] and the proposed method, trained on the CASIA-WebFace [yi2014learning] dataset with 10% and 20% closed-set noise.
  • Figure 5: The recall and precision of label noise detection, and the label correction accuracy on 10% and 20% closed-set noise datasets.
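The thresholding and splitting described in the Figure 2 caption, parts (a) and (b), can be sketched roughly as follows. The caption states only that $\eta$ is the average cosine similarity of auxiliary samples to their random labels and that splitting compares the positive-center similarity with the nearest-negative-center similarity. The function names `dynamic_threshold` and `split_samples` and the exact decision rule below are assumptions made for illustration, not the authors' method.

```python
import numpy as np

def dynamic_threshold(cos, aux_idx, aux_labels):
    """Eta: mean cosine similarity of auxiliary samples to their random labels."""
    return float(np.mean(cos[aux_idx, aux_labels]))

def split_samples(cos, labels, eta):
    """Split samples into 'clean', 'ambiguous', and 'noisy' groups.

    cos    : (n_samples, n_classes) cosine similarity to each class center
    labels : (n_samples,) assigned (possibly noisy) labels
    eta    : threshold, e.g. from dynamic_threshold

    Assumed rule (hypothetical):
      clean     if the positive center is closest and its similarity exceeds eta
      noisy     if a negative center is closer and its similarity exceeds eta
      ambiguous otherwise
    """
    n = cos.shape[0]
    rows = np.arange(n)
    pos = cos[rows, labels]                 # similarity to the assigned class center
    masked = cos.astype(np.float64).copy()
    masked[rows, labels] = -np.inf          # exclude the positive class
    nearest_neg = masked.argmax(axis=1)     # index of the nearest negative center
    neg = masked[rows, nearest_neg]

    groups = np.full(n, "ambiguous", dtype=object)
    groups[(pos > neg) & (pos > eta)] = "clean"
    groups[(neg > pos) & (neg > eta)] = "noisy"
    return groups, nearest_neg

# Toy cosine matrix: 3 samples, 3 classes, all labelled as class 0.
cos = np.array([[0.90, 0.10, 0.00],
                [0.20, 0.80, 0.10],
                [0.30, 0.25, 0.10]])
labels = np.array([0, 0, 0])
groups, nearest_neg = split_samples(cos, labels, eta=0.5)
```

In this toy example the first sample sits close to its own center, the second sits close to a different center, and the third is not confidently close to any center, so under the assumed rule they fall into the clean, noisy, and ambiguous groups respectively.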