VoxAtnNet: A 3D Point Clouds Convolutional Neural Network for Generalizable Face Presentation Attack Detection
Raghavendra Ramachandra, Narayan Vetrekar, Sushma Venkatesh, Savita Nageshker, Jag Mohan Singh, R. S. Gad
TL;DR
This work tackles face presentation attack detection (PAD) on smartphones by using 3D point clouds captured with the front camera and preserving their spatial structure through voxelization. It introduces VoxAtnNet, a 23-layer residual 3D CNN with a novel attention mechanism that operates on a $64 \times 64 \times 64$ occupancy grid and discriminates strongly between bona fide faces and 3D presentation attack instruments (PAIs). A new dataset, 3D-PCPA, comprises bona fide samples, 3D silicone mask attacks, and 3D wrap photo attacks, and supports evaluation on unseen instruments. Experiments show VoxAtnNet's superior performance and generalizability across three evaluation protocols (intra, inter, and both). The findings highlight the feasibility and impact of 3D point-cloud-based PAD on smartphones, offering a path toward more robust, device-agnostic biometric security.
Abstract
Facial biometrics are an essential component of smartphones, ensuring reliable and trustworthy authentication. However, face biometric systems are vulnerable to Presentation Attacks (PAs), and the availability of more sophisticated Presentation Attack Instruments (PAIs), such as 3D silicone face masks, allows attackers to deceive face recognition systems easily. In this work, we propose a novel Presentation Attack Detection (PAD) algorithm based on 3D point clouds captured using the frontal camera of a smartphone. The proposed PAD algorithm, VoxAtnNet, voxelizes the 3D point clouds to preserve their spatial structure. The voxelized 3D samples are then used to train a novel convolutional attention network that detects PAs on the smartphone. Extensive experiments were carried out on a newly constructed 3D face point cloud dataset comprising bona fide samples and two different 3D PAIs (3D silicone face mask and wrap photo mask), resulting in 3480 samples. The proposed method was benchmarked against existing methods using three different evaluation protocols. The experimental results demonstrate the improved performance of the proposed method in detecting both known and unknown face presentation attacks.
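To make the voxelization step concrete, the following is a minimal sketch of converting a point cloud into a binary $64 \times 64 \times 64$ occupancy grid. It is an illustration under stated assumptions, not the authors' implementation: the function name `voxelize_point_cloud`, the min-max normalization with a single shared scale, and the binary (rather than count- or density-based) occupancy are all choices made here for the example.

```python
import numpy as np


def voxelize_point_cloud(points, grid_size=64):
    """Map an (N, 3) point cloud to a binary occupancy grid.

    Hypothetical helper for illustration only; the normalization
    scheme is an assumption, not taken from the paper.
    """
    points = np.asarray(points, dtype=np.float64)
    if points.ndim != 2 or points.shape[1] != 3 or points.shape[0] == 0:
        raise ValueError("points must be a non-empty array of shape (N, 3)")

    # Shift the cloud to the origin and scale by the largest axis extent,
    # so the aspect ratio of the face is preserved inside the unit cube.
    mins = points.min(axis=0)
    extent = (points.max(axis=0) - mins).max()
    if extent == 0:
        extent = 1.0  # degenerate cloud: all points coincide
    normalized = (points - mins) / extent

    # Convert coordinates in [0, 1] to integer voxel indices; points on
    # the upper boundary are clamped into the last voxel.
    indices = np.minimum((normalized * grid_size).astype(np.int64),
                         grid_size - 1)

    # Mark every voxel that contains at least one point.
    grid = np.zeros((grid_size, grid_size, grid_size), dtype=np.uint8)
    grid[indices[:, 0], indices[:, 1], indices[:, 2]] = 1
    return grid
```

A grid produced this way has a fixed shape regardless of how many points the capture contains, which is what allows a 3D CNN to consume it directly.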
