Fairness measures for biometric quality assessment

André Dörsch, Torsten Schlett, Peter Munch, Christian Rathgeb, Christoph Busch

TL;DR

The paper addresses fairness in biometric quality assessment by formalizing differential performance measures (DPM) across demographic groups. It compares GC-based fairness metrics with new variants, including the Cubed Sample Quality Fairness Rate (CSQFR), Low-Weighted-Mean (LWM) scores, and the Mean-Discard-Gap (MDG), and details their mathematical formulations. The key finding is that GC-based measures can miss outliers or bias in the score distributions, whereas LWM-SQFR and especially the CSQFR variants reveal biased scenarios more clearly, and MDG-SQFR exposes worst-case gaps. The work proposes these measures as candidates for standardization and emphasizes evaluating them on field data to guide fair FIQA implementations.

Abstract

Quality assessment algorithms measure the quality of a captured biometric sample. Since sample quality strongly affects the recognition performance of a biometric system, it is essential to process only samples of sufficient quality and to discard samples of low quality. Even though quality assessment algorithms are not intended to yield very different quality scores across demographic groups, quality score discrepancies are possible, resulting in different discard ratios. To ensure that quality assessment algorithms do not take demographic characteristics into account when assessing sample quality, and consequently that they perform equally for all individuals, it is crucial to develop a fairness measure. In this work we propose and compare multiple fairness measures for evaluating quality components across demographic groups. The proposed measures could serve as candidates for an upcoming standard in this important field.
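
As a rough illustration of how quality score discrepancies translate into different discard ratios, the sketch below computes, for each demographic group, the fraction of samples whose quality score falls below a threshold. The function name `discard_ratios`, the threshold of 40, the group labels, and the toy scores are all assumptions made for illustration; they are not taken from the paper, and this is not one of the proposed fairness measures.

```python
from collections import defaultdict


def discard_ratios(scores, groups, threshold):
    """Return, per group, the fraction of samples with score below threshold.

    Hypothetical helper for illustration only: a sample is treated as
    discarded when its quality score is strictly below the threshold.
    """
    total = defaultdict(int)
    discarded = defaultdict(int)
    for score, group in zip(scores, groups):
        total[group] += 1
        if score < threshold:
            discarded[group] += 1
    return {group: discarded[group] / total[group] for group in total}


# Toy data (assumed): quality scores for two fictitious groups.
scores = [80, 35, 60, 20, 90, 45, 30, 70]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

ratios = discard_ratios(scores, groups, threshold=40)
print(ratios)  # {'A': 0.5, 'B': 0.25}
```

In this toy case the same threshold discards half of group A but only a quarter of group B, which is the kind of gap a fairness measure for quality assessment is meant to quantify.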

Paper Structure (8 sections, 8 equations, 5 figures, 9 tables)

Figures (5)

  • Figure 1: Various examples of face image defects (i.e. factors) of a captured sample that negatively impact the recognition performance. As a result, the images shown are not compliant with the requirements formulated in ISO/IEC 39794-5 [ISO-IEC-39794-5-G3-FaceImage-191015]. Facial images taken from [ISO-IEC-39794-5-G3-FaceImage-191015].
  • Figure 2: Fictitious quality component $Q_1$ (slightly biased): KDE plot of the demographic score distribution
  • Figure 3: Fictitious quality component $Q_2$ (strongly biased): KDE plot of the demographic score distribution
  • Figure 4: Fictitious quality component $Q_3$: KDE plot of the demographic score distribution
  • Figure 5: Fictitious quality component $Q_5$: KDE plot of the demographic score distribution