LH2Face: Loss function for Hard High-quality Face
Fan Xie, Yang Wang, Yikang Jiao, Zhenyu Yuan, Congxi Chen, Chuanxin Zhao
TL;DR
LH2Face addresses the challenge of recognizing hard high-quality faces by introducing a vMF-based similarity with an adaptive margin (Uncertainty-Aware Margin Function), complemented by proxy-based losses to shape the proxy–sample space and a face-reconstruction renderer to inject 3D cues into FR training. The method explicitly ties margin and concentration to sample quality via $m = 0.35\mu_{|\mathbf{z}|}$ and $\kappa = |\mathbf{z}|$, and optimizes the overall loss $\mathcal{L}_{\text{FR}} = \mathcal{L}_{\text{vMF}} + \mathcal{L}_{\text{proxy-based}}$, where $\mathcal{L}_{\text{proxy-based}} = \mathcal{L}_{\text{pps}} + \mathcal{L}_{\text{pns}} + \mathcal{L}_{\text{pp}}$. A reconstruction branch adds $\mathcal{L}_{\text{train}} = \mathcal{L}_{\text{FR}} + \lambda_{\text{reco}}\mathcal{L}_{\text{reco}} + \lambda_{\text{canon}}\mathcal{L}_{\text{FR}}^{\text{canon}} + \lambda_{\text{view}}\mathcal{L}_{\text{view}}$ to jointly optimize FR and 3D-aware reconstruction. Empirical results on CPLFW, IJB-B/C, and related high-quality datasets show improvements over strong baselines, validating the approach and highlighting reconstruction as a productive auxiliary signal, while acknowledging limitations on very low-quality data and suggesting diffusion/GAN-based reconstruction as future work.
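As a reading aid, the loss composition stated above can be sketched in a few lines of Python. This is only an illustration of how the terms add up, not the authors' implementation: the function names and argument names are hypothetical, the individual loss terms are assumed to be already-computed scalars, and the weights used in the usage lines are placeholders rather than values from the paper.

```python
def fr_loss(l_vmf, l_pps, l_pns, l_pp):
    """L_FR = L_vMF + L_proxy-based, with L_proxy-based = L_pps + L_pns + L_pp."""
    l_proxy_based = l_pps + l_pns + l_pp
    return l_vmf + l_proxy_based


def train_loss(l_fr, l_reco, l_fr_canon, l_view, lam_reco, lam_canon, lam_view):
    """L_train = L_FR + lam_reco*L_reco + lam_canon*L_FR^canon + lam_view*L_view."""
    return l_fr + lam_reco * l_reco + lam_canon * l_fr_canon + lam_view * l_view


# Usage with made-up scalar values (placeholders, not results from the paper).
l_fr = fr_loss(l_vmf=1.0, l_pps=0.5, l_pns=0.25, l_pp=0.25)
total = train_loss(l_fr, l_reco=1.0, l_fr_canon=1.0, l_view=1.0,
                   lam_reco=0.5, lam_canon=0.25, lam_view=0.25)
```

In a real training loop each term would be a differentiable tensor; plain floats are used here only to keep the sketch self-contained.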
Abstract
In current practical face authentication systems, most face recognition (FR) algorithms are based on cosine similarity with softmax classification. Despite its reliable classification performance, this approach struggles with hard samples. A popular strategy to improve FR performance is to incorporate angular or cosine margins. However, this strategy does not take face quality or recognition hardness into account; it simply increases the margin value, which makes the training strategy overly uniform. To address this problem, a novel loss function is proposed, named Loss function for Hard High-quality Face (LH2Face). Firstly, a similarity measure based on the von Mises-Fisher (vMF) distribution is introduced, specifically focusing on the logarithm of the Probability Density Function (PDF), which represents the distance between a probability distribution and a vector. Then, an adaptive margin-based multi-classification method using softmax, called the Uncertainty-Aware Margin Function, is presented. Furthermore, proxy-based loss functions are used to apply extra constraints between the proxy and sample to optimize their representation space distribution. Finally, a renderer is constructed that optimizes FR through face reconstruction and vice versa. LH2Face outperforms similar schemes on hard high-quality face datasets, achieving 49.39% accuracy on the IJB-B dataset, which surpasses the second-place method by 2.37%.
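To make the vMF-based similarity more concrete, the following is a minimal sketch of the standard vMF log-density, $\log f(\mathbf{x}; \boldsymbol{\mu}, \kappa) = \kappa\,\boldsymbol{\mu}^{\top}\mathbf{x} + \log C_d(\kappa)$ with $C_d(\kappa) = \kappa^{d/2-1} / \big((2\pi)^{d/2} I_{d/2-1}(\kappa)\big)$. It shows only the textbook density, not the Uncertainty-Aware Margin Function or the exact similarity used in LH2Face. The function names are hypothetical, inputs are assumed to be unit-norm Python sequences, and the Bessel function is approximated by a truncated power series (an assumption that is adequate for moderate $\kappa$ only).

```python
import math


def log_bessel_i(v, x, terms=200):
    """Approximate log I_v(x) for x > 0 by summing the power series in log space.

    I_v(x) = sum_k (x/2)^(2k+v) / (k! * Gamma(k+v+1)); truncated after `terms`
    terms, so it is only reliable for moderate x (assumption of this sketch).
    """
    log_half_x = math.log(x / 2.0)
    logs = [
        (2 * k + v) * log_half_x - math.lgamma(k + 1) - math.lgamma(k + v + 1)
        for k in range(terms)
    ]
    peak = max(logs)  # log-sum-exp trick to avoid overflow/underflow
    return peak + math.log(sum(math.exp(t - peak) for t in logs))


def vmf_log_pdf(x, mu, kappa):
    """Log-density of a vMF distribution with mean direction mu and concentration kappa.

    x and mu are assumed to be unit vectors of the same dimension d; kappa > 0.
    """
    d = len(x)
    v = d / 2.0 - 1.0
    log_norm = (
        v * math.log(kappa)
        - (d / 2.0) * math.log(2.0 * math.pi)
        - log_bessel_i(v, kappa)
    )
    cosine = sum(a * b for a, b in zip(x, mu))
    return kappa * cosine + log_norm


# Usage: a sample aligned with the mean direction scores higher than an orthogonal one.
mu = (0.0, 0.0, 1.0)
aligned = vmf_log_pdf((0.0, 0.0, 1.0), mu, kappa=2.0)
orthogonal = vmf_log_pdf((1.0, 0.0, 0.0), mu, kappa=2.0)
```

Because the normalizer depends only on $\kappa$ and $d$, the log-density grows linearly with the cosine between the sample and the mean direction, scaled by $\kappa$; this is what lets a per-sample concentration act as a quality-dependent weighting of cosine similarity.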
