Normalized Validity Scores for DNNs in Regression based Eye Feature Extraction
Wolfgang Fuhl
TL;DR
This work addresses inaccuracies in regression-based landmark detection for eye feature extraction by introducing a normalized joint inaccuracy loss and a gradient-margin mechanism. By adding a per-landmark inaccuracy output and normalizing the error term with respect to the overall shape area, the method stabilizes training and allows outliers to be excluded using the inaccuracy signal. Empirical results on real-world TEyEDS-derived data show consistent improvements in MIoU and MED across pupil, iris, and eyelid landmarks; the Euclidean-based inaccuracy often performs best, and tuning the margin (0.005) proves crucial. The approach improves both landmark estimation and the downstream shape extraction that builds on it, with implications for more robust eye-tracking pipelines. Limitations and potential societal risks are acknowledged.
Abstract
We propose an improvement to the landmark validity loss. Landmark detection is widely used in head pose estimation, eyelid shape extraction, and pupil and iris segmentation, and in numerous other applications where it serves to estimate the shape of complex objects. One part of this process is the accurate and fine-grained detection of the shape. The other part is the validity or inaccuracy per landmark, which can be used to detect unreliable areas where the shape possibly does not fit, and to improve the accuracy of the entire shape extraction by excluding inaccurate landmarks. We propose a normalization in the loss formulation, which improves the accuracy of the entire approach because the normalized inaccuracy is numerically balanced. In addition, we propose a margin for the inaccuracy that reduces the impact of gradients produced by negligible errors close to the ground truth.
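
The following is a minimal NumPy sketch of the three ideas named above (area normalization, a margin on small errors, and exclusion of landmarks by predicted inaccuracy). It is not the paper's formulation. The function names, the use of the square root of the ground-truth polygon area as the normalizer, the hinge form of the margin, the squared-error terms, and the exclusion threshold are all assumptions made for illustration. Only the Euclidean error and the margin value 0.005 are taken from the text.

```python
import numpy as np

def polygon_area(points):
    """Shoelace area of a closed polygon given as an (N, 2) array."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def normalized_errors(pred, gt, eps=1e-8):
    """Per-landmark Euclidean error divided by a shape-size scale.

    Assumption: the scale is sqrt(area) of the ground-truth polygon, so the
    error becomes dimensionless and comparable across shape sizes.
    """
    dist = np.linalg.norm(pred - gt, axis=1)
    scale = np.sqrt(polygon_area(gt)) + eps
    return dist / scale

def apply_margin(err, margin=0.005):
    """Hinge margin (assumed form): errors below the margin become zero,
    so they would contribute no gradient in a differentiable framework."""
    return np.maximum(err - margin, 0.0)

def joint_loss(pred, pred_inacc, gt, margin=0.005):
    """Hypothetical joint loss: landmark term plus inaccuracy-regression term."""
    err = apply_margin(normalized_errors(pred, gt), margin)
    landmark_term = np.mean(err ** 2)
    inaccuracy_term = np.mean((pred_inacc - err) ** 2)
    return landmark_term + inaccuracy_term

def keep_reliable(pred, pred_inacc, threshold):
    """Drop landmarks whose predicted inaccuracy exceeds a chosen threshold."""
    return pred[pred_inacc <= threshold]

# Toy example: a 10 x 10 square with one landmark displaced by (3, 4).
gt = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
pred = gt.copy()
pred[0] += np.array([3.0, 4.0])
errors = apply_margin(normalized_errors(pred, gt))
reliable = keep_reliable(pred, errors, threshold=0.1)
```

In the toy example the displaced landmark has a raw error of 5 pixels, which the assumed normalizer (side length 10) maps to about 0.5 before the margin is subtracted. The three exact landmarks fall inside the margin and contribute nothing, and a threshold of 0.1 on the inaccuracy removes only the displaced landmark.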
