Deep Tiny Network for Recognition-Oriented Face Image Quality Assessment

Baoyun Peng, Min Liu, Zhaoning Zhang, Kai Xu, Dongsheng Li

TL;DR

The paper addresses recognition instability from low-quality face images in video sequences by introducing a recognition-oriented, non-reference FIQA method. It defines a quality metric via quality(x) = cosine(f_x, u_y) that directly links image quality to FR performance, and develops tinyFQnet, an ultra-efficient 21.8k-parameter network trained to predict this quality score. The approach enables automatic generation of large-scale quality labels and includes a data-balancing strategy to improve training effectiveness. Extensive experiments on IJB-B, IJB-C, and YTF show that tinyFQnet outperforms perceptual FIQA baselines and several learning-based methods while requiring far less computation, making it well-suited as a plug-in for resource-constrained FR systems. Limitations include reliance on cosine-based similarity and the need to validate generalizability across more diverse datasets and metrics, suggesting avenues for future work.
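The quality metric quality(x) = cosine(f_x, u_y) can be illustrated with a minimal sketch. It assumes that f_x is the FR embedding of image x and that u_y is a per-identity reference embedding (for example a class center); this reading of the symbols and the function name `cosine_quality` are illustrative assumptions, not the authors' code.

```python
import math


def cosine_quality(feature, class_center):
    """Return cosine(f_x, u_y) for one embedding and its identity's center.

    Both arguments are plain sequences of floats with the same length.
    Assumption: a zero-norm vector gets quality 0.0 instead of raising.
    """
    dot = sum(a * b for a, b in zip(feature, class_center))
    norm_f = math.sqrt(sum(a * a for a in feature))
    norm_u = math.sqrt(sum(b * b for b in class_center))
    if norm_f == 0.0 or norm_u == 0.0:
        return 0.0
    return dot / (norm_f * norm_u)


# Hypothetical 2D embeddings: one close to the center, one far from it.
center = [1.0, 0.0]
sharp_face = [0.9, 0.1]
blurry_face = [0.2, 0.8]
print(cosine_quality(sharp_face, center) > cosine_quality(blurry_face, center))
```

Under this definition, an image whose embedding points in nearly the same direction as its identity's center receives a score close to 1, which is what lets a trained FR model generate quality labels automatically.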

Abstract

Face recognition has made significant progress in recent years due to deep convolutional neural networks (CNNs). In many face recognition (FR) scenarios, face images are acquired from a sequence with huge intra-variations. These intra-variations, which stem mainly from low-quality face images, make recognition performance unstable. Previous works have focused on ad-hoc methods to select frames from a video, or on face image quality assessment (FIQA) methods that consider only a particular distortion or a combination of several distortions. In this work, we present an efficient non-reference image quality assessment for FR that directly links image quality assessment (IQA) and FR. More specifically, we propose a new measurement to evaluate image quality without any reference. Based on the proposed quality measurement, we propose a deep Tiny Face Quality network (tinyFQnet) to learn a quality prediction function from data. We evaluate the proposed method with different powerful FR models on two classical video-based (or template-based) benchmarks: IJB-B and YTF. Extensive experiments show that, although tinyFQnet is much smaller than the other networks, the proposed method outperforms state-of-the-art quality assessment methods in terms of both effectiveness and efficiency.

Paper Structure

This paper contains 21 sections, 2 equations, 7 figures, 8 tables, 1 algorithm.

Figures (7)

  • Figure 1: A typical selection process. Random selection may select a low-quality image that is hard for the FR model to recognize. A good FIQA selection method can improve the performance of recognition.
  • Figure 2: The pipeline for video-based FR. The base set contains multiple identities, and each identity has several images. Typically, the frames are captured under significant variations, such as large head pose, illumination, motion blur, and occlusion. Our method consists of four steps. In Step 1, we train the FR network. In Step 2, the trained FR network is used to label face images with quality scores. In Step 3, we train a FIQA network with an L2 regression loss. For testing in Step 4, we first use the trained FIQA network to select high-quality face images and feed them to the trained FR network to extract features; the extracted features are then used for face verification.
  • Figure 3: Geometric interpretation of the recognition-oriented quality metric in a 2D feature embedding space. $\bm{W}_{y_i}$ and $\bm{W}_{y_j}$ are the centers of classes $y_i$ and $y_j$, $\bm{f}_i$ and $\bm{f}_j$ are the feature vectors of $\bm{x}_i$ and $\bm{x}_j$, and $\theta_i$ and $\theta_j$ are the angles between $\bm{f}_i$ and $\bm{W}_{y_i}$ and between $\bm{f}_j$ and $\bm{W}_{y_j}$, respectively.
  • Figure 4: The details of the two basic blocks in tinyFQnet. (a): non-residual block; (b): residual block.
  • Figure 5: Distribution of quality scores under different data sizes: large (left column), middle (middle column), and small (right column). The dataset is MS-Celeb-1M [guo2016ms], and the FR model is R50. The top row shows the results of a random sampling strategy, and the bottom row shows the results of a smooth sampling strategy.
  • ...and 2 more figures
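
The selection in Step 4 of Figure 2 can be sketched as a top-k ranking over predicted quality scores. This is a minimal illustration only: the function name `select_top_k`, the choice of a fixed k, and the example scores are assumptions, and the scores stand in for the output of a trained FIQA network rather than reproducing it.

```python
def select_top_k(frames, scores, k):
    """Return the k frames with the highest predicted quality scores.

    frames: list of frame identifiers (any objects).
    scores: list of floats, one predicted quality score per frame.
    Ties keep their original order because Python's sort is stable.
    """
    if len(frames) != len(scores):
        raise ValueError("frames and scores must have the same length")
    if k <= 0:
        return []
    ranked = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    return [frames[i] for i in ranked[:k]]


# Hypothetical scores for four frames of one video sequence.
frames = ["frame_0", "frame_1", "frame_2", "frame_3"]
scores = [0.31, 0.87, 0.12, 0.64]
print(select_top_k(frames, scores, 2))
```

Only the selected frames would then be passed to the FR network for feature extraction, so the cost of the quality network is paid once per frame while the larger FR network runs on a subset.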