GazeTrack: High-Precision Eye Tracking Based on Regularization and Spatial Computing
Xiaoyin Yang
TL;DR
This work targets high-precision gaze tracking for spatial computing by introducing GazeTrack, a framework comprising a high-quality dataset and three core components: U-ResAtt for shape-regularized pupil segmentation, CoordTransNet for elastic coordinate transformation, and GVnet for efficient gaze-vector generation. The approach yields substantial improvements in pupil segmentation accuracy, coordinate normalization, and gaze-vector estimation, while maintaining lower computational complexity than prior methods. The GazeTrack dataset, with diverse subjects and multi-angle imagery, underpins robust evaluation and supports reproducibility. Overall, the paper pushes toward real-time, accurate gaze tracking suitable for VR/AR spatial computing applications.
Abstract
Eye tracking has become increasingly important in virtual and augmented reality applications; however, current gaze accuracy falls short of the requirements of spatial computing. We design a gaze collection framework and use high-precision equipment to gather the first precise benchmark dataset, GazeTrack, which covers diverse ethnicities, ages, and visual acuity conditions for pupil localization and gaze tracking. We propose a novel shape error regularization method that constrains pupil ellipse fitting and, trained on open-source datasets, improves semantic segmentation and pupil position prediction accuracy. Additionally, we introduce a coordinate transformation method, analogous to unfolding a sheet of paper, to accurately predict gaze vectors on the GazeTrack dataset. Finally, we build a gaze vector generation model that achieves lower gaze angle error at lower computational complexity than other methods.
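For readers unfamiliar with the metric, gaze angle error is commonly reported as the angle between a predicted and a ground-truth 3D gaze vector. The following is a minimal sketch of that common definition, not necessarily the exact evaluation protocol used in this paper; the function name `angular_error_deg` and the tuple-based vector representation are illustrative assumptions.

```python
import math


def angular_error_deg(pred, target):
    """Angle in degrees between two 3D gaze vectors.

    Assumption: both vectors are given as (x, y, z) sequences in the same
    coordinate frame. They do not need to be unit length.
    """
    dot = sum(p * t for p, t in zip(pred, target))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in target))
    if norm_p == 0.0 or norm_t == 0.0:
        raise ValueError("gaze vectors must be non-zero")
    # Clamp to [-1, 1] so floating-point rounding cannot push the cosine
    # outside the domain of acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_t)))
    return math.degrees(math.acos(cos_angle))
```

Averaging this value over all test samples would give a mean angular error in degrees, which is the form in which gaze accuracy is usually compared across methods.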
