Deep Learning-based Compression Detection for explainable Face Image Quality Assessment
Laurin Jonientz, Johannes Merkle, Christian Rathgeb, Benjamin Tams, Georg Merz
TL;DR
The paper tackles the need for explainable face image quality assessment by detecting compression artefacts that degrade recognition performance. It trains EfficientNetV2-B0 regression networks using PSNR- and SSIM-based labels derived from artefact-free originals compressed with JPEG and JPEG 2000, leveraging a large synthetic dataset to learn artefact severity. The authors demonstrate low detection error rates (e.g., $EER$ around $2\%$–$3.4\%$) and strong correlations with compression strength ($\rho$ ≈ $0.89$–$0.94$), plus improved biometric performance when severely compressed images are discarded, across both open-source and commercial systems. Integrated into the open OFIQ software, the approach provides an explainable, actionable component for face image quality assessment with potential for separate-format detectors and score fusion in future work.
Abstract
The assessment of face image quality is crucial to ensure reliable face recognition. In order to provide data subjects and operators with explainable and actionable feedback regarding captured face images, relevant quality components have to be measured. Quality components that are known to negatively impact the utility of face images include JPEG and JPEG 2000 compression artefacts, among others. Compression can result in a loss of important image details, which may impair recognition performance. In this work, deep neural networks are trained to detect compression artefacts in face images. For this purpose, artefact-free facial images are compressed with the JPEG and JPEG 2000 compression algorithms. Subsequently, the PSNR and SSIM metrics are employed to obtain training labels, based on which neural networks are trained, with a single network detecting both JPEG and JPEG 2000 artefacts. The evaluation of the proposed method shows promising results: in terms of detection accuracy, error rates of 2-3% are obtained when PSNR labels are used during training. In addition, we show that error rates of different open-source and commercial face recognition systems can be significantly reduced by discarding face images exhibiting severe compression artefacts. To minimize resource consumption, EfficientNetV2 serves as the basis for the presented algorithm, which is available as part of the OFIQ software.
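To illustrate how a PSNR-based training label might be derived from an artefact-free original and its compressed counterpart, the following is a minimal sketch rather than the authors' implementation. It uses the standard PSNR definition for 8-bit pixel values. The function `psnr_to_label` and its clipping bounds `low` and `high` are hypothetical, since the abstract does not state how PSNR values are mapped to labels.

```python
import math


def psnr(original, compressed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences."""
    if len(original) != len(compressed):
        raise ValueError("images must have the same number of pixels")
    if len(original) == 0:
        raise ValueError("images must not be empty")
    # Mean squared error over all pixel positions.
    mse = sum((a - b) ** 2 for a, b in zip(original, compressed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion at all
    return 10.0 * math.log10(max_value ** 2 / mse)


def psnr_to_label(psnr_db, low=20.0, high=50.0):
    """Map a PSNR value to a label in [0, 1].

    Assumption: linear scaling between hypothetical bounds `low` and `high`;
    the paper's actual mapping is not given in the abstract.
    """
    clipped = min(max(psnr_db, low), high)
    return (clipped - low) / (high - low)


# Toy example: four pixels of an original and a slightly distorted copy.
original = [100, 150, 200, 250]
compressed = [101, 149, 202, 248]
label = psnr_to_label(psnr(original, compressed))
```

In practice, the compressed image would be produced by encoding the original with JPEG or JPEG 2000 at a chosen quality setting and decoding it again; a regression network would then be trained to predict the label from the compressed image alone.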
