Model Compression Techniques in Biometrics Applications: A Survey
Eduarda Caldeira, Pedro C. Neto, Marco Huber, Naser Damer, Ana F. Sequeira
TL;DR
This survey analyzes model compression techniques in biometrics, focusing on quantization, knowledge distillation (KD), and pruning. It systematizes the literature, comparing biometrics-specific work to broader computer vision findings, and highlights how compression can impact bias and fairness. The authors discuss methodological trade-offs (PTQ vs. QAT, RW-KD vs. FB-KD, LW/CW pruning) and emphasize the potential and limits of combining techniques to achieve edge-friendly biometrics with acceptable accuracy. They also review evidence that compression can worsen disparities across demographic groups, urging the development of fairness-aware compression methods and balanced evaluation frameworks. Overall, the work maps current progress, identifies gaps (especially for quantization and pruning in biometrics), and proposes future directions toward fairer, more efficient biometric models.
Abstract
The development of deep learning algorithms has greatly expanded humanity's capacity to automate tasks. However, the large performance gains of these models are closely tied to their growing complexity, which limits their usefulness in human-oriented applications, as these are usually deployed on resource-constrained devices. This has led to the development of compression techniques that drastically reduce the computational and memory costs of deep learning models without significant performance degradation. This paper systematizes the current literature on this topic by presenting a comprehensive survey of model compression techniques in biometrics applications, namely quantization, knowledge distillation, and pruning. We critically analyze the comparative value of these techniques, focusing on their advantages and disadvantages, and suggest future work directions that could improve current methods. Additionally, we discuss and analyze the link between model bias and model compression, highlighting the need to direct compression research toward model fairness in future work.
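
As a rough illustration of two of the three technique families named above, the following minimal sketch applies symmetric per-tensor int8 quantization (a simple form of post-training quantization) and unstructured magnitude pruning to a weight array. It is a generic NumPy example under stated assumptions, not a method taken from the survey: the function names `quantize_int8`, `dequantize`, and `magnitude_prune` are hypothetical, and real pipelines typically operate per layer or per channel on trained networks.

```python
import numpy as np


def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to int8 (illustrative)."""
    max_abs = float(np.max(np.abs(weights)))
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Map int8 codes back to approximate float32 weights."""
    return q.astype(np.float32) * scale


def magnitude_prune(weights, sparsity):
    """Zero out roughly the `sparsity` fraction of weights with smallest magnitude.

    Ties at the threshold are pruned as well, so slightly more weights
    than requested may be removed.
    """
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value serves as the pruning threshold.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 64)).astype(np.float32)

    q, scale = quantize_int8(w)
    w_hat = dequantize(q, scale)
    print("max quantization error:", float(np.max(np.abs(w - w_hat))))

    w_pruned = magnitude_prune(w, sparsity=0.5)
    print("fraction of zeros after pruning:", float(np.mean(w_pruned == 0)))
```

Storing `q` instead of `w` cuts memory roughly fourfold relative to float32, at the cost of a rounding error bounded by half the scale per weight; pruning instead trades accuracy for sparsity, which only yields speedups when the runtime can exploit it.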
