Suppressing Uncertainty in Gaze Estimation
Shijing Wang, Yaping Huang
TL;DR
We address data uncertainty in gaze estimation, which arises from low-quality images and mislabeled gaze points. Our approach, SUGE, introduces a triplet-label consistency framework that combines neighboring labels, uncertainty metrics, and Gaussian Mixture Model confidences to drive label correction and sample weighting, together with a co-training setup that curbs self-training bias. By suppressing unreliable data during training, the method achieves state-of-the-art performance on EyeDiap, MPIIFaceGaze, Gaze360, and ETH-XGaze-based tasks. The work highlights the importance of data quality in gaze systems and provides a practical framework for improving robustness on real-world datasets.
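To make the idea of Gaussian Mixture Model confidences concrete, here is a minimal sketch of one common pattern in noisy-label learning: fit a two-component 1D mixture to per-sample uncertainty scores and treat the posterior of the low-mean component as the confidence that a sample is reliable. This is an illustration under stated assumptions, not the SUGE implementation; the function names, the scalar scores, and the use of the posterior as a sample weight are all hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_two_component_gmm(values, iters=50):
    """Fit a two-component 1D GMM with plain EM (hypothetical helper)."""
    lo, hi = min(values), max(values)
    means = [lo, hi]                          # start one component at each extreme
    spread = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [spread, spread]
    mix = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            p = [mix[c] * gaussian_pdf(x, means[c], variances[c]) for c in range(2)]
            total = sum(p) or 1e-300          # guard against underflow to zero
            resp.append([pc / total for pc in p])
        # M-step: re-estimate mean, variance, and mixing weight per component.
        for c in range(2):
            nc = sum(r[c] for r in resp)
            if nc < 1e-12:
                continue
            means[c] = sum(r[c] * x for r, x in zip(resp, values)) / nc
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, values)) / nc
            variances[c] = max(var, 1e-6)     # variance floor for stability
            mix[c] = nc / len(values)
    return means, variances, mix

def reliable_confidence(values, means, variances, mix):
    """Posterior probability that each value belongs to the low-mean component."""
    low = 0 if means[0] <= means[1] else 1
    out = []
    for x in values:
        p = [mix[c] * gaussian_pdf(x, means[c], variances[c]) for c in range(2)]
        total = sum(p) or 1e-300
        out.append(p[low] / total)
    return out

# Hypothetical uncertainty scores: five consistent samples and two outliers.
scores = [0.02, 0.03, 0.025, 0.04, 0.03, 0.60, 0.55]
means, variances, mix = fit_two_component_gmm(scores)
confidence = reliable_confidence(scores, means, variances, mix)
```

In this sketch the confidences could serve directly as per-sample loss weights, so that samples assigned to the high-uncertainty component contribute little to training.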
Abstract
Uncertainty in gaze estimation manifests in two aspects: 1) low-quality images caused by occlusion, blurriness, inconsistent eye movements, or even non-face images; 2) incorrect labels resulting from misalignment between the labeled and actual gaze points during annotation. Allowing such uncertain samples to participate in training hinders the improvement of gaze estimation. To tackle these challenges, we propose an effective solution, named Suppressing Uncertainty in Gaze Estimation (SUGE), which introduces a novel triplet-label consistency measurement to estimate and reduce the uncertainties. Specifically, for each training sample, we estimate a novel "neighboring label", calculated by a linearly weighted projection from its neighbors, to capture the similarity relationship between image features and their corresponding labels. This neighboring label is combined with the predicted pseudo label and the ground-truth label for uncertainty estimation. By modeling such triplet-label consistency, we can measure the quality of both images and labels, and substantially reduce the negative effects of low-quality images and wrong labels through our sample weighting and label correction strategies. Experimental results on gaze estimation benchmarks indicate that SUGE achieves state-of-the-art performance.
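The neighboring label and the triplet-label comparison can be illustrated with a small sketch. It assumes gaze labels are (pitch, yaw) pairs, that neighbors are chosen by cosine similarity in feature space, and that their labels are combined with similarity weights normalized to sum to one; the abstract does not specify these details, so the weighting scheme, the Euclidean disagreement measure, and all function names here are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def neighboring_label(index, features, labels, k=2):
    """Weighted average of the labels of the k most feature-similar samples."""
    sims = [
        (cosine_similarity(features[index], features[j]), j)
        for j in range(len(features))
        if j != index
    ]
    top = sorted(sims, reverse=True)[:k]
    # Assumption: similarities are clipped at zero, then normalized to sum to 1.
    weights = [max(s, 0.0) for s, _ in top]
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / len(top)] * len(top)
    else:
        weights = [w / total for w in weights]
    dim = len(labels[0])
    return [
        sum(w * labels[j][d] for w, (_, j) in zip(weights, top))
        for d in range(dim)
    ]

def triplet_disagreement(gt, pred, neigh):
    """Pairwise Euclidean distances among the three labels of one sample."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return {
        "gt_pred": dist(gt, pred),
        "gt_neigh": dist(gt, neigh),
        "pred_neigh": dist(pred, neigh),
    }

# Hypothetical toy data: sample 0 looks like samples 1 and 2 but is labeled differently.
features = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0], [0.0, 1.0]]
labels = [[0.5, 0.5], [0.1, 0.2], [0.1, 0.2], [-0.3, 0.4]]   # (pitch, yaw) in radians
predicted = [0.12, 0.2]                                      # model's pseudo label for sample 0

neigh = neighboring_label(0, features, labels, k=2)
scores = triplet_disagreement(labels[0], predicted, neigh)
```

Here the pseudo label and the neighboring label agree with each other while the ground-truth label disagrees with both, which is the kind of pattern that would point to a wrong label rather than a low-quality image. When all three labels disagree, the image itself is the more likely culprit.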
