Large-Margin Softmax Loss for Convolutional Neural Networks
Weiyang Liu, Yandong Wen, Zhiding Yu, Meng Yang
TL;DR
The paper presents Large-Margin Softmax (L-Softmax), a margin-based extension of the softmax loss that enforces larger angular separation between class decision boundaries. By replacing the ground-truth cosine term with a margin-controlled function psi(theta) parameterized by an integer m, L-Softmax yields increased intra-class compactness and inter-class separability. The authors provide a geometric interpretation, an SGD-friendly optimization framework, and extensive experiments on MNIST, CIFAR-10/100, and LFW demonstrating improved discriminative features and verification performance. As a drop-in replacement for softmax, L-Softmax offers adjustable angular margins to enhance CNN-based recognition and verification tasks without substantial overhead.
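To make the margin mechanism concrete, here is a minimal pure-Python sketch of the loss for a single sample. It assumes the piecewise margin function psi(theta) = (-1)^k cos(m*theta) - 2k for theta in [k*pi/m, (k+1)*pi/m], with k in {0, ..., m-1}, and a bias-free final layer. The names `psi` and `l_softmax_loss` are illustrative and not taken from the authors' code, and the sketch omits any training-time stabilization.

```python
import math

def psi(theta, m):
    """Margin function on [0, pi]: (-1)^k * cos(m*theta) - 2k,
    where k indexes the interval [k*pi/m, (k+1)*pi/m] containing theta."""
    k = min(int(theta * m / math.pi), m - 1)  # clamp so theta == pi maps to k = m-1
    return (-1) ** k * math.cos(m * theta) - 2 * k

def l_softmax_loss(x, W, label, m):
    """L-Softmax loss for one feature vector x (list of floats).
    W is a list of class weight vectors; no bias term (an assumption here).
    Assumes x and every row of W are nonzero."""
    x_norm = math.sqrt(sum(v * v for v in x))
    logits = []
    for j, w in enumerate(W):
        dot = sum(a * b for a, b in zip(w, x))
        if j == label:
            w_norm = math.sqrt(sum(a * a for a in w))
            # Clamp the cosine to guard against rounding just outside [-1, 1].
            cos_t = max(-1.0, min(1.0, dot / (w_norm * x_norm)))
            theta = math.acos(cos_t)
            # Ground-truth class: replace cos(theta) with psi(theta).
            logits.append(w_norm * x_norm * psi(theta, m))
        else:
            # Other classes keep the usual inner product ||w|| ||x|| cos(theta_j).
            logits.append(dot)
    # Numerically stable log-sum-exp, then the negative log-likelihood.
    mx = max(logits)
    log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
    return log_sum - logits[label]

x = [1.0, 0.5]
W = [[1.0, 0.0], [0.0, 1.0]]
loss_m1 = l_softmax_loss(x, W, label=0, m=1)  # m = 1 is the ordinary softmax loss
loss_m4 = l_softmax_loss(x, W, label=0, m=4)  # larger m penalizes the same sample more
```

With m = 1 the function reduces to the standard softmax cross-entropy. For m > 1 the ground-truth logit shrinks unless theta is small, which is what pushes features of the same class toward a tighter angular cluster.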
Abstract
Cross-entropy loss together with softmax is arguably one of the most commonly used supervision components in convolutional neural networks (CNNs). Despite its simplicity, popularity and excellent performance, the component does not explicitly encourage discriminative learning of features. In this paper, we propose a generalized large-margin softmax (L-Softmax) loss which explicitly encourages intra-class compactness and inter-class separability between learned features. Moreover, L-Softmax can not only adjust the desired margin but also avoid overfitting. We also show that the L-Softmax loss can be optimized by typical stochastic gradient descent. Extensive experiments on four benchmark datasets demonstrate that the deeply-learned features with L-Softmax loss become more discriminative, hence significantly boosting the performance on a variety of visual classification and verification tasks.
