L2-constrained Softmax Loss for Discriminative Face Verification

Rajeev Ranjan, Carlos D. Castillo, Rama Chellappa

TL;DR

For unconstrained face verification, the paper observes that softmax loss does not explicitly maximize the margin between positive and negative pairs, and that feature norms correlate with image quality. It introduces L2-softmax, which constrains feature norms to a fixed radius $\alpha$ using an $L_{2}$-normalize layer followed by a scale layer, preserving end-to-end training. The approach yields consistent improvements on LFW, YouTube Faces (YTF), and IJB-A: it sets the state of the art on IJB-A with a TAR of $0.909$ at a FAR of $0.0001$, and reaches $99.78\%$ on LFW and $96.08\%$ on YTF with RX101. These gains are complementary to metric-learning methods and auxiliary losses, and the method is easy to integrate with different DCNN architectures, which makes it practical for discriminative face verification.

Abstract

In recent years, the performance of face verification systems has significantly improved using deep convolutional neural networks (DCNNs). A typical pipeline for face verification includes training a deep network for subject classification with softmax loss, using the penultimate layer output as the feature descriptor, and generating a cosine similarity score given a pair of face images. The softmax loss function does not optimize the features to have higher similarity score for positive pairs and lower similarity score for negative pairs, which leads to a performance gap. In this paper, we add an L2-constraint to the feature descriptors which restricts them to lie on a hypersphere of a fixed radius. This module can be easily implemented using existing deep learning frameworks. We show that integrating this simple step in the training pipeline significantly boosts the performance of face verification. Specifically, we achieve state-of-the-art results on the challenging IJB-A dataset, achieving True Accept Rate of 0.909 at False Accept Rate 0.0001 on the face verification protocol. Additionally, we achieve state-of-the-art performance on LFW dataset with an accuracy of 99.78%, and competing performance on YTF dataset with accuracy of 96.08%.
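
The L2-constraint described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`l2_constrain`, `softmax_loss`, `cosine_similarity`), the toy feature vector, the classifier weights, and the value `alpha = 16.0` are all assumptions chosen for illustration. The sketch rescales a feature to lie on a hypersphere of radius `alpha`, feeds it to a linear classifier with softmax loss, and shows that the cosine similarity used at test time depends only on the feature's direction.

```python
import math

def l2_constrain(feature, alpha, eps=1e-12):
    """L2-normalize the feature, then scale it to norm `alpha`."""
    norm = math.sqrt(sum(v * v for v in feature))
    return [alpha * v / max(norm, eps) for v in feature]

def softmax_loss(logits, label):
    """Cross-entropy of softmax(logits) against the true class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical values, for illustration only.
alpha = 16.0                      # assumed radius of the hypersphere
feature = [3.0, 4.0]              # toy penultimate-layer output
weights = [[1.0, 0.0],            # toy classifier, one row per class
           [0.0, 1.0],
           [-1.0, 0.0]]

constrained = l2_constrain(feature, alpha)
logits = [sum(w * f for w, f in zip(row, constrained)) for row in weights]
loss = softmax_loss(logits, label=1)
```

Because the constraint changes only the length of the feature, `cosine_similarity(feature, constrained)` stays at 1 up to floating-point error; what changes is that every training sample now enters the softmax with the same norm.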

Paper Structure

This paper contains 16 sections, 9 equations, 9 figures, 6 tables.

Figures (9)

  • Figure 1: (a) Face verification performance on the IJB-A dataset. The templates are divided into $3$ sets based on their $L_{2}$-norm. '1' denotes the set with low $L_{2}$-norm, while '3' denotes the set with high $L_{2}$-norm. The legend 'x-y' denotes the evaluation pairs where one template is from set 'x' and the other from set 'y'. (b) Sample template images from the IJB-A dataset with high, medium, and low $L_{2}$-norm.
  • Figure 2: A general pipeline for training and testing a face verification system using DCNN.
  • Figure 3: Visualization of $2$-dimensional features for the MNIST digit classification test set using (a) Softmax Loss. (b) L2-Softmax Loss
  • Figure 4: We add an $L_{2}$-normalize layer and a scale layer to constrain the feature descriptor to lie on a hypersphere of radius $\alpha$.
  • Figure 5: (a) $2$-D visualization of the assumed distribution of features (b) Variation in Softmax probability with respect to $\alpha$ for different numbers of classes $C$
  • ...and 4 more figures