Learning Face Representation from Scratch
Dong Yi, Zhen Lei, Shengcai Liao, Stan Z. Li
TL;DR
The authors address the scarcity of public large-scale training data for face recognition by introducing CASIA-WebFace, a semi-automatically collected, IMDb-based dataset with 10,575 subjects and 494,414 face images. They train an 11-layer CNN with a joint identification and verification loss to learn a compact, discriminative 320-dimensional face representation. Evaluations on LFW and YouTube Faces (YTF) show strong performance: a single network is competitive with, or better than, some ensemble methods, and results under the BLUFR protocol indicate robustness at low false-alarm rates. The dataset gives the community a public benchmark for standardizing evaluation and accelerating progress in face recognition in the wild.
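To make the phrase "joint identification and verification loss" concrete, here is a minimal sketch of how such an objective could be combined for one pair of face images. It assumes a softmax cross-entropy term for identification and a contrastive term for verification; these specific forms, the function names, and the `weight` and `margin` parameters are illustrative assumptions and are not taken from the summary above.

```python
import math

def softmax_cross_entropy(logits, label):
    """Identification term (assumed form): softmax cross-entropy over identity logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def contrastive_loss(f1, f2, same_identity, margin=1.0):
    """Verification term (assumed form): contrastive loss on two feature vectors."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))
    if same_identity:
        # Same subject: penalize any distance between the two features.
        return 0.5 * dist ** 2
    # Different subjects: penalize only pairs closer than the margin.
    return 0.5 * max(0.0, margin - dist) ** 2

def joint_loss(logits1, label1, logits2, label2, f1, f2, weight=0.5, margin=1.0):
    """Sum of both identification terms plus a weighted verification term.

    `weight` and `margin` are hypothetical hyperparameters for this sketch.
    """
    ident = (softmax_cross_entropy(logits1, label1)
             + softmax_cross_entropy(logits2, label2))
    verif = contrastive_loss(f1, f2, label1 == label2, margin)
    return ident + weight * verif
```

In a real network, `f1` and `f2` would be the 320-dimensional features of the two images and the logits would span all training identities; the short vectors used when trying this sketch are only placeholders.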
Abstract
Driven by big data and deep convolutional neural networks (CNNs), the performance of face recognition is becoming comparable to that of humans. Using private large-scale training datasets, several groups have achieved very high accuracy on LFW, i.e., 97% to 99%. While there are many open-source implementations of CNNs, no large-scale face dataset is publicly available. The current situation in the field of face recognition is that data is more important than algorithms. To solve this problem, this paper proposes a semi-automatic way to collect face images from the Internet and builds a large-scale dataset containing about 10,000 subjects and 500,000 images, called CASIA-WebFace. Based on this database, we use an 11-layer CNN to learn a discriminative representation and obtain state-of-the-art accuracy on LFW and YTF. The publication of CASIA-WebFace will attract more research groups to this field and accelerate the development of face recognition in the wild.
