Confidence-aware 3D Gaze Estimation and Evaluation Metric

Qiaojie Zheng, Xiaoli Zhang

TL;DR

This work tackles unreliable and overconfident appearance-based 3D gaze estimation by introducing a confidence-aware network that outputs both gaze angles and their uncertainties. It employs a heteroskedastic loss and an end-to-end architecture built on a ResNet-18 backbone per eye, with the overall uncertainty taken as the maximum of the two per-angle (pitch and yaw) uncertainties. To evaluate how effective the uncertainty estimates are, the authors propose a causal correlation metric that links eye feature degradation (introduced via corruptions of controllable severity) to the inferred uncertainty, and they report better discrimination than the traditional correlation between uncertainty and angular error. Experimental results on MPIIGaze and RT-GENE show strong per-corruption uncertainty signaling (e.g., correlation $C\approx0.95$) and robust cross-dataset behavior, suggesting practical benefits for safer HMI deployment, where unreliable gaze estimates can be flagged or discarded before they trigger an action. The proposed framework advances both gaze estimation and uncertainty evaluation, enabling more trustworthy, real-time gaze-enabled interaction under variable visual conditions.

Abstract

Deep learning appearance-based 3D gaze estimation is gaining popularity because of its minimal hardware requirements and freedom from constraints. Unreliable and overconfident inferences, however, still limit the adoption of this gaze estimation method. To address these issues, we introduce a confidence-aware model that predicts uncertainties together with gaze angle estimates. We also introduce a novel method for evaluating the effectiveness of the uncertainty estimation, based on the causality between eye feature degradation and the rise in inference uncertainty. Our confidence-aware model demonstrates reliable uncertainty estimation while providing angular estimation accuracy on par with the state of the art. Compared with the existing statistical uncertainty-angular-error evaluation metric, the proposed evaluation approach can more effectively judge the quality of the inferred uncertainty at each prediction.
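
As a rough illustration of how a model can learn an uncertainty alongside each gaze angle, the sketch below uses the common Gaussian negative log-likelihood form of a heteroskedastic loss. This specific form, the log-variance parameterization, and the function names `heteroscedastic_loss` and `overall_uncertainty` are assumptions made for illustration and may differ from the loss used in the paper. Taking the maximum of the pitch and yaw uncertainties as the overall uncertainty follows the description of the model.

```python
import math


def heteroscedastic_loss(pred, target, log_var):
    """Gaussian negative log-likelihood (up to a constant) for one angle.

    pred, target: predicted and ground-truth angle (same unit, e.g. radians).
    log_var: predicted log-variance s = log(sigma^2). Predicting s instead of
    sigma keeps the variance positive without extra constraints (assumption).
    """
    squared_error = (target - pred) ** 2
    # A large predicted variance down-weights the error term but is
    # penalized by the 0.5 * log_var term, so the model cannot simply
    # claim high uncertainty everywhere.
    return 0.5 * math.exp(-log_var) * squared_error + 0.5 * log_var


def overall_uncertainty(sigma_pitch, sigma_yaw):
    """Overall inference uncertainty: the larger of the two per-angle values."""
    return max(sigma_pitch, sigma_yaw)


if __name__ == "__main__":
    # Hypothetical numbers, only to show the calling convention.
    loss = heteroscedastic_loss(pred=0.10, target=0.25, log_var=-2.0)
    print("loss:", loss)
    print("overall uncertainty:", overall_uncertainty(0.04, 0.09))
```

Under this form, the loss for a fixed error is minimized when the predicted variance equals the squared error, which is what ties the predicted uncertainty to the actual prediction quality during training.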

Paper Structure

This paper contains 26 sections, 3 equations, 12 figures, 1 table.

Figures (12)

  • Figure 1: The proposed confidence-aware model (top) and the uncertainty effectiveness evaluation approach (bottom). Our model learns to judge the prediction confidence based on eye feature quality in the input images with our proposed loss function. Inference uncertainties are produced together with gaze angle estimates. Our uncertainty effectiveness assessment is based on the asserted causality between eye feature degradation and inference uncertainty. We assess the effectiveness based on the correlation strength between the inferred uncertainty and the severity of intentionally introduced corruptions used to achieve different levels of eye feature degradation.
  • Figure 2: Network structure adapted from Fischer2018 for confidence-aware 3D gaze estimation. This network outputs uncertainty values for pitch and yaw angles, respectively. The maximum between the pitch uncertainty and yaw uncertainty represents the overall inference uncertainty.
  • Figure 3: Proposed procedure for evaluating the effectiveness of uncertainty estimation. Corruptions of controllable severity are intentionally applied to clean images, which are then passed to the confidence-aware model for inference. The model's estimated uncertainties are compared with the corruption severity levels to assess effectiveness.
  • Figure 4: Visualization of 14 image corruptions. The top left shows the uncorrupted image for reference. These corruptions are adopted from ImageNet-C (Hendrycks2019). The elastic transform corruption is not used because its behavior is inconsistent with the corruption severity.
  • Figure 5: Custom implementation of off-cropping image corruption with 5 levels of severity. The leftmost column contains uncorrupted images. The most severe off-cropping removes the eye features entirely by moving the crop center by the width or height of the patch. The remaining 4 severity levels are evenly spaced in crop-center displacement between no off-crop and the most severe level.
  • ...and 7 more figures
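
The evaluation procedure described in Figures 3 and 5 can be sketched as follows: compute the crop-center shift for each off-crop severity level, then correlate the severity levels with the uncertainties the model reports. This is a minimal sketch under stated assumptions: Pearson correlation stands in for the paper's correlation measure (the source does not specify which one is used), the 60-pixel patch width is hypothetical, and the uncertainty values are made up purely to show the calculation. The names `off_crop_shift` and `pearson` are illustrative.

```python
import math


def off_crop_shift(severity, patch_size, levels=5):
    """Crop-center displacement (pixels) for an off-crop severity level.

    Severity 0 means no off-crop; severity `levels` moves the crop center
    by the full patch width or height, so the eye is cropped out entirely.
    Intermediate levels are spaced evenly in between.
    """
    if not 0 <= severity <= levels:
        raise ValueError("severity must be between 0 and levels")
    return patch_size * severity / levels


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Severity 0 is the clean image; 1..5 are the corrupted versions.
severities = [0, 1, 2, 3, 4, 5]
shifts = [off_crop_shift(s, patch_size=60) for s in severities]  # 60 px is hypothetical

# Made-up uncertainties standing in for the model's outputs on one image.
toy_uncertainty = [0.05, 0.08, 0.12, 0.20, 0.31, 0.45]

c = pearson(severities, toy_uncertainty)
print("shifts (px):", shifts)
print("severity-uncertainty correlation:", c)
```

A correlation near 1 indicates that the reported uncertainty rises consistently as the eye features degrade, which is the behavior the evaluation is designed to reward.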