GazeTrak: Exploring Acoustic-based Eye Tracking on a Glass Frame
Ke Li, Ruidong Zhang, Boao Chen, Siyuan Chen, Sicheng Yin, Saif Mahmud, Qikang Liang, François Guimbretière, Cheng Zhang
TL;DR
GazeTrak presents the first acoustic-based gaze tracking system integrated into glasses, using two speakers and eight microphones to emit inaudible FMCW signals and extract echo profiles for gaze inference via a ResNet-18–based model. The approach achieves a cross-session mean gaze angular error (MGAE) of 4.9° and an in-session MGAE of 3.6° at 83.3 Hz, with a total power consumption of about 287.9 mW on a portable platform, dropping to about 95.4 mW at 30 Hz when deployed on the MAX78002 with optimized processing. A ground-truth calibration method using on-screen instruction points enables training without external trackers, and an MCU-based real-time pipeline demonstrates feasible deployment on a low-power device. The work highlights strong potential for long-duration, privacy-preserving gaze tracking in wearable contexts, while also outlining calibration, integration, and environmental robustness challenges for future work.
Abstract
In this paper, we present GazeTrak, the first acoustic-based eye tracking system on glasses. Our system needs only one speaker and four microphones attached to each side of the glasses. These acoustic sensors capture the formations of the eyeballs and the surrounding areas by emitting encoded inaudible sound towards the eyeballs and receiving the reflected signals. The reflected signals are then processed to compute echo profiles, which are fed to a customized deep learning pipeline to continuously infer the gaze position. In a user study with 20 participants, GazeTrak achieves an accuracy of 3.6° within the same remounting session and 4.9° across different sessions, with a refresh rate of 83.3 Hz and a power signature of 287.9 mW. Furthermore, we report the performance of our gaze tracking system fully implemented on an MCU with a low-power CNN accelerator (MAX78002). In this configuration, the system runs at up to 83.3 Hz and has a total power signature of 95.4 mW at a 30 Hz refresh rate.
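The echo-profile step described above can be illustrated with a minimal sketch: each received frame is cross-correlated with the transmitted chirp, so that the lag of the correlation peak reflects the round-trip delay of a reflection. The sampling rate, frame length, and sweep band below are assumptions chosen so that the frame rate comes out at 83.3 Hz; they are not stated in this summary. The function names (`make_chirp`, `echo_profile`, `simulate`) are hypothetical, and the simulated receiver is a toy stand-in for real microphone data rather than the authors' pipeline.

```python
import numpy as np

# Assumed parameters (not given in the text above):
FS = 50_000                   # sampling rate in Hz
FRAME = 600                   # samples per chirp; FS / FRAME = 83.3 frames per second
F0, F1 = 18_000.0, 21_000.0   # inaudible sweep band in Hz


def make_chirp(n=FRAME, fs=FS, f0=F0, f1=F1):
    """One linear FMCW chirp sweeping from f0 to f1 over n samples."""
    t = np.arange(n) / fs
    slope = (f1 - f0) / (n / fs)  # sweep rate in Hz per second
    return np.cos(2 * np.pi * (f0 * t + 0.5 * slope * t ** 2))


def echo_profile(rx, chirp):
    """Circularly cross-correlate each received frame with the chirp.

    Returns an array of shape (n_frames, len(chirp)); column index is the
    lag in samples, so a strong reflection shows up as a peak at its delay.
    """
    n = len(chirp)
    frames = rx[: len(rx) // n * n].reshape(-1, n)
    chirp_spec = np.conj(np.fft.rfft(chirp))
    return np.fft.irfft(np.fft.rfft(frames, axis=1) * chirp_spec, n=n, axis=1)


def simulate(delay, n_frames=4, gain=0.5):
    """Toy receiver: the continuously repeated chirp, attenuated and delayed."""
    tx = np.tile(make_chirp(), n_frames)
    return gain * np.roll(tx, delay)


if __name__ == "__main__":
    profile = echo_profile(simulate(delay=37), make_chirp())
    print(profile.shape, profile.argmax(axis=1))
```

In a real system one such profile would be computed per microphone channel, and the stacked profiles over time would form the image-like input to the learning model.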
