Non-verbal Hands-free Control for Smart Glasses using Teeth Clicks
Payal Mohapatra, Ali Aroudi, Anurag Kumar, Morteza Khaleghimeybodi
TL;DR
This paper addresses discreet, hands-free control for smart glasses by detecting two teeth-click patterns from nose-pad accelerometer signals. It introduces STEALTHsense, a lightweight temporal-broadcasting neural network with approximately $88\mathrm{K}$ parameters and a cost of $7.14$ MMAC (million multiply-accumulate operations). Trained on data from 21 participants, the model achieves a cross-subject balanced accuracy of $0.93$ under noisy conditions. The approach combines tailored data augmentation, a 41-feature time-frequency representation, and an on-device inference pipeline designed for real-time use. Field tests report positive user adoption and perceived accuracy, pointing to practical potential for unobtrusive interaction with AR glasses. Overall, the work delivers a compact, noise-robust solution for non-verbal human-computer interaction with smart glasses and outlines paths toward personalization and broader usability.
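As a rough illustration of the cross-subject balanced accuracy metric mentioned above, the following minimal Python sketch computes balanced accuracy (the mean of per-class recalls) for each held-out participant and then averages across participants. The class names (`none`, `click_a`, `click_b`), the participant ids, and the choice to average per participant rather than pool all samples are assumptions made for illustration; the source does not specify the exact evaluation protocol.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def cross_subject_score(per_subject):
    """Average balanced accuracy over held-out participants.

    per_subject maps a (hypothetical) participant id to (y_true, y_pred).
    """
    scores = [balanced_accuracy(t, p) for t, p in per_subject.values()]
    return sum(scores) / len(scores)


# Toy labels, invented for illustration: non-patterns dominate, as they
# would in continuous use, so plain accuracy would overstate performance.
results = {
    "s01": (["none"] * 4 + ["click_a", "click_b"],
            ["none"] * 4 + ["click_a", "none"]),
    "s02": (["none", "click_a", "click_b"],
            ["none", "click_a", "click_b"]),
}

print(balanced_accuracy(*results["s01"]))
print(cross_subject_score(results))
```

In this toy case, participant `s01` has 5 of 6 samples correct, but the missed `click_b` drops its recall to zero, so the balanced score is lower than plain accuracy. This is why a balanced metric is a natural fit when non-pattern samples far outnumber click patterns.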
Abstract
Smart glasses are emerging as a popular wearable computing platform with the potential to revolutionize the next generation of human-computer interaction. Their widespread adoption has created a pressing need for discreet, hands-free control methods. Traditional input techniques, such as voice commands or tactile gestures, can be intrusive and conspicuous. Additionally, voice-based control may not function well in noisy acoustic conditions. We propose a novel, discreet, non-verbal, and non-tactile approach to controlling smart glasses through subtle vibrations on the skin induced by teeth clicking. We demonstrate that these vibrations can be sensed by accelerometers embedded in the glasses and recognized with a low-footprint predictive model. Our proposed method, called STEALTHsense, uses a temporal broadcasting-based neural network architecture with just 88K trainable parameters and 7.14 million multiply-accumulate operations (MMAC) per inference unit. We benchmark STEALTHsense against state-of-the-art deep learning approaches and traditional low-footprint machine learning approaches. We conducted a study with 21 participants to collect representative samples of two distinct teeth-clicking patterns and many non-patterns for robust training of STEALTHsense, which achieves an average cross-person accuracy of 0.93. Field testing confirmed its effectiveness, even in noisy conditions, underscoring STEALTHsense's potential as a practical solution for real-world smart glasses interaction.
