AuthGlass: Benchmarking Voice Liveness Detection and Authentication on Smart Glasses via Comprehensive Acoustic Features
Weiye Xu, Zhang Jiang, Siqi Zheng, Xiyuxing Zhang, Changhao Zhang, Jian Liu, Weiqiang Wang, Yuntao Wang
TL;DR
AuthGlass tackles the security gap in voice-based interaction on smart glasses by introducing a public, high-resolution, multi-channel dataset and hardware platform. It presents AuthG-Live, a sound-field-based liveness detector, and AuthG-Net, a multi-acoustic-modal authentication model that fuses air-conduction (AC), bone-conduction (BC), and sound-field (SF) cues for robust user verification. Across four benchmark tasks, the approach achieves state-of-the-art performance and generalizes to unseen attacks and cross-utterance scenarios; ablations show resilience under reduced modalities and commercial-device configurations. The work offers practical design insights for microphone layout and supports future research through open data and hardware resources.
Abstract
With the rapid advancement of smart glasses, voice interaction has been widely adopted due to its naturalness and convenience. However, its practical deployment is often undermined by vulnerability to spoofing attacks, and no public dataset currently exists for voice liveness detection and authentication in smart-glasses scenarios. To address these gaps, we first collect a multi-acoustic-modal dataset comprising 16-channel audio data from 42 subjects, along with corresponding attack samples covering two attack categories. Based on insights derived from the collected data, we propose AuthG-Live, a sound-field-based voice liveness detection method, and AuthG-Net, a multi-acoustic-modal authentication model. We further benchmark seven voice liveness detection methods and four authentication methods across diverse acoustic modalities. The results demonstrate that our proposed approach achieves state-of-the-art performance on four benchmark tasks, and extensive ablation studies validate the generalizability of our methods across different modality combinations. Finally, we release this dataset, termed AuthGlass, to facilitate future research on voice liveness detection and authentication for smart glasses.
