ColFigPhotoAttnNet: Reliable Finger Photo Presentation Attack Detection Leveraging Window-Attention on Color Spaces
Anudeep Vurity, Emanuela Marasco, Raghavendra Ramachandra, Jongwoo Park
TL;DR
This work tackles the cross-device robustness problem in finger photo presentation attack detection by introducing ColFigPhotoAttnNet, a hybrid architecture that processes RGB, HSV, and YCbCr color spaces in parallel via MobileNet V3 backbones. It employs window-based self-attention within 7×7 local regions to capture localized color-space relationships, followed by a Nested Residual Block predictor and 8-bit dynamic quantization for mobile deployment. The framework is evaluated on three finger-photo datasets across inter- and intra-capture settings, demonstrating superior generalization compared to state-of-the-art CNNs and transformers, and revealing the benefits of multi-color-space fusion alongside the trade-offs introduced by quantization. The results emphasize the impact of capture-device evolution on PAD performance and highlight ColFigPhotoAttnNet as a practical, efficient solution for robust, device-agnostic finger photo PAD in real-world mobile security contexts.
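To make the multi-color-space input concrete, the following minimal sketch derives the three views (RGB, HSV, YCbCr) of a single pixel. It is an illustration only: the summary above does not state which conversion formulas the authors use, so the full-range ITU-R BT.601 (JFIF) coefficients for YCbCr are an assumption, and the helper names `rgb_to_hsv`, `rgb_to_ycbcr`, and `color_space_views` are hypothetical.

```python
import colorsys


def rgb_to_hsv(r, g, b):
    """Convert 8-bit RGB to HSV, with every component scaled to [0, 1]."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)


def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range YCbCr.

    Assumption: ITU-R BT.601 / JFIF coefficients; the paper's exact
    conversion is not given in this section. Values are not clamped.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (y, cb, cr)


def color_space_views(pixel):
    """Return the three color-space views of one (r, g, b) pixel."""
    r, g, b = pixel
    return {
        "rgb": (r, g, b),
        "hsv": rgb_to_hsv(r, g, b),
        "ycbcr": rgb_to_ycbcr(r, g, b),
    }


if __name__ == "__main__":
    for name, value in color_space_views((200, 150, 120)).items():
        print(name, value)
```

In a full pipeline each view would be computed for the whole image and passed to its own backbone branch; the per-pixel form here only shows how the three representations differ.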
Abstract
Finger photo Presentation Attack Detection (PAD) can significantly strengthen smartphone device security. However, existing PAD algorithms are trained to detect only certain types of attacks. Furthermore, they are designed to operate on images acquired by specific capture devices, which leads to poor generalization and a lack of robustness as mobile hardware evolves. This investigation is the first to systematically analyze the performance degradation of existing deep learning PAD systems, both convolutional and transformer-based, in cross-capture-device settings. In this paper, we introduce the ColFigPhotoAttnNet architecture, which is built on window attention over color channels followed by a nested residual network as the predictor, to achieve reliable PAD. Extensive experiments using various capture devices, including iPhone 13 Pro, Google Pixel 3, Nokia C5, and OnePlus One, were carried out to evaluate the performance of the proposed and existing methods on three publicly available databases. The findings underscore the effectiveness of our approach.
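The window attention mentioned above can be sketched as follows, using the 7×7 window size stated in the TL;DR. This is a simplified illustration under stated assumptions, not the authors' implementation: queries, keys, and values are taken to be the tokens themselves (no learned projections, heads, or positional terms), windows do not overlap, and the function names `partition_windows` and `window_self_attention` are hypothetical.

```python
import math


def partition_windows(fmap, window=7):
    """Split an H x W grid of feature vectors into non-overlapping tiles.

    Each tile is returned as a flat, row-major list of window*window tokens.
    """
    h, w = len(fmap), len(fmap[0])
    if h % window or w % window:
        raise ValueError("height and width must be multiples of the window size")
    tiles = []
    for top in range(0, h, window):
        for left in range(0, w, window):
            tiles.append(
                [fmap[top + i][left + j] for i in range(window) for j in range(window)]
            )
    return tiles


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def window_self_attention(tokens):
    """Scaled dot-product self-attention restricted to one window.

    Assumption: Q = K = V = tokens (learned projections are omitted).
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append(
            [sum(wt * v[c] for wt, v in zip(weights, tokens)) for c in range(d)]
        )
    return out


if __name__ == "__main__":
    # Toy 14 x 14 feature map with 3 channels per location.
    fmap = [[[r / 14.0, c / 14.0, 1.0] for c in range(14)] for r in range(14)]
    tiles = partition_windows(fmap, window=7)
    attended = [window_self_attention(tile) for tile in tiles]
    print(len(tiles), len(attended[0]), len(attended[0][0]))
```

Because each token attends only to the 49 tokens in its own window, the cost grows linearly with the number of windows rather than quadratically with the full image size, which is the usual motivation for window-based attention on mobile-scale models.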
