
Model Compression Techniques in Biometrics Applications: A Survey

Eduarda Caldeira, Pedro C. Neto, Marco Huber, Naser Damer, Ana F. Sequeira

TL;DR

This survey analyzes model compression techniques in biometrics, focusing on quantization, knowledge distillation (KD), and pruning. It systematizes the literature, comparing biometrics-specific work to broader computer vision findings, and highlights how compression can impact bias and fairness. The authors discuss methodological trade-offs (PTQ vs. QAT, RB-KD vs. FB-KD, LW/CW pruning) and emphasize the potential and limits of combining techniques to achieve edge-friendly biometrics with acceptable accuracy. They also review evidence that compression can worsen disparities across demographic groups, urging the development of fairness-aware compression methods and balanced evaluation frameworks. Overall, the work maps current progress, identifies gaps (especially for quantization and pruning in biometrics), and proposes future directions toward fairer, more efficient biometric models.

Abstract

The development of deep learning algorithms has extensively empowered humanity's task automatization capacity. However, the huge improvement in the performance of these models is highly correlated with their increasing level of complexity, limiting their usefulness in human-oriented applications, which are usually deployed in resource-constrained devices. This led to the development of compression techniques that drastically reduce the computational and memory costs of deep learning models without significant performance degradation. This paper aims to systematize the current literature on this topic by presenting a comprehensive survey of model compression techniques in biometrics applications, namely quantization, knowledge distillation and pruning. We conduct a critical analysis of the comparative value of these techniques, focusing on their advantages and disadvantages and presenting suggestions for future work directions that can potentially improve the current methods. Additionally, we discuss and analyze the link between model bias and model compression, highlighting the need to direct compression research toward model fairness in future works.

Paper Structure

This paper contains 26 sections, 14 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Quantization framework of a signed uniform affine quantizer. The top and bottom axes represent the continuous FP and the discrete quantized spectra, respectively. The orange and green arrows represent the effects of the rounding and clipping functions (Equations \ref{eq:quant1} and \ref{eq:quant2}), respectively.
  • Figure 2: Knowledge distillation framework consisting of RB-KD and FB-KD terms ($L_{RB-KD}$ and $L_{FB-KD}$, respectively). These terms are weighted by the hyperparameter $\lambda$, resulting in the student's loss, $L_{student}$.
  • Figure 3: Pruning framework based on the traditional $L_1$-norm. The original network (top) is pruned in a layer-wise fashion to a sparsity of 50%; the pruned weights are marked in red.
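
The captions for Figures 1 and 3 describe two mechanisms that can be sketched in a few lines of code. The Python below is an illustrative sketch only, not the survey's own formulation: the function names, the `scale` and `zero_point` parameters, the 8-bit default, and the choice to rank individual weights (rather than whole filters) by absolute value are all assumptions made for this example.

```python
def quantize(x, scale, zero_point, bits=8):
    """Signed uniform affine quantizer (assumed form): round, shift, clip."""
    q_min = -(2 ** (bits - 1))        # e.g. -128 for 8 bits
    q_max = 2 ** (bits - 1) - 1       # e.g.  127 for 8 bits
    # Rounding step. Note: Python's round() rounds exact halves to even.
    q = round(x / scale) + zero_point
    # Clipping step: values outside the integer grid saturate at its ends.
    return max(q_min, min(q_max, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to an approximate floating-point value."""
    return scale * (q - zero_point)


def prune_layer_l1(weights, sparsity=0.5):
    """Zero the `sparsity` fraction of one layer's weights with smallest |w|."""
    k = int(len(weights) * sparsity)
    # Indices sorted by absolute value (the L1 norm of a single weight).
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def prune_network_layerwise(layers, sparsity=0.5):
    """Layer-wise pruning: each layer is ranked and pruned independently."""
    return [prune_layer_l1(layer, sparsity) for layer in layers]


if __name__ == "__main__":
    print(quantize(0.3, scale=0.1, zero_point=0))     # in-range value
    print(quantize(100.0, scale=0.1, zero_point=0))   # saturates at q_max
    print(prune_network_layerwise([[0.5, -0.1, 0.05, -0.9], [0.2, 0.7]]))
```

Because pruning here is applied per layer, every layer ends up at the same sparsity regardless of how its weight magnitudes compare with those of other layers; a global ranking across layers would behave differently.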