Precise localization of corneal reflections in eye images using deep learning trained on synthetic data
Sean Anthony Byrne, Marcus Nyström, Virmarie Maquiling, Enkelejda Kasneci, Diederick C. Niehorster
TL;DR
The paper addresses precise localization of the corneal reflection (CR) center in eye images without requiring annotated real data, by training a CNN on synthetically generated CR images. It introduces a CNN with seven convolutional layers, trained in two stages, that localizes the CR center with sub-pixel accuracy and outperforms traditional methods on real eye images. The study demonstrates that synthetic data can deliver high-performance CR localization and modest gains in gaze precision, while acknowledging that noise in the pupil signal remains a limiting factor. This approach reduces the data-labeling burden and offers a flexible, scalable path for improving CR-based gaze trackers, with open-source code and data available for replication and extension. The work also highlights future directions, including pupil localization and handling multiple CRs, to extend applicability to real-world, low-quality imaging settings.
Abstract
We present a deep learning method for accurately localizing the center of a single corneal reflection (CR) in an eye image. Unlike previous approaches, we use a convolutional neural network (CNN) that was trained solely on simulated data. Using only simulated data completely sidesteps the time-consuming manual annotation that is required for supervised training on real eye images. To systematically evaluate the accuracy of our method, we first tested it on images with simulated CRs placed on different backgrounds and embedded in varying levels of noise. Second, we tested the method on high-quality videos captured from real eyes. On real eye images, our method outperformed state-of-the-art algorithmic methods, with a 35% reduction in the spatial precision measure, and on simulated images it performed on par with the state of the art in terms of spatial accuracy. We conclude that our method localizes the CR center precisely and offers a solution to the data availability problem, which is one of the common roadblocks in the development of deep learning models for gaze estimation. Due to the superior CR center localization and ease of application, our method has the potential to improve the accuracy and precision of CR-based eye trackers.
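
To illustrate why simulated data removes the need for manual annotation, the following minimal sketch renders a synthetic CR whose sub-pixel center is known by construction and measures the localization error of a simple estimator against it. Everything here is an assumption made for illustration: the abstract does not specify how the simulated CRs are generated, so the Gaussian blob, the noise model, the parameter values, and the function names `make_synthetic_cr` and `centroid_estimate` are hypothetical. The estimator is a plain thresholded intensity-weighted centroid standing in for a generic algorithmic baseline; it is neither the proposed CNN nor any specific method evaluated in the paper.

```python
import numpy as np


def make_synthetic_cr(size=64, center=(31.4, 28.7), sigma=3.0,
                      noise_sd=0.02, background=0.1, rng=None):
    """Render a bright Gaussian blob at a known sub-pixel center (x, y).

    Assumption: a 2D Gaussian on a flat background with additive
    Gaussian noise; the paper's actual image synthesis may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    ys, xs = np.mgrid[0:size, 0:size]
    cx, cy = center
    blob = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
    img = background + (1.0 - background) * blob
    img = img + rng.normal(0.0, noise_sd, img.shape)
    return np.clip(img, 0.0, 1.0)


def centroid_estimate(img, threshold=0.5):
    """Thresholded intensity-weighted centroid, returned as (x, y)."""
    # Subtracting the threshold gives pixels at the blob edge near-zero weight.
    weights = np.clip(img - threshold, 0.0, None)
    total = weights.sum()
    if total == 0:
        raise ValueError("no pixel exceeds the threshold")
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    return (float((xs * weights).sum() / total),
            float((ys * weights).sum() / total))


if __name__ == "__main__":
    true_center = (31.4, 28.7)  # ground truth comes for free with simulation
    image = make_synthetic_cr(center=true_center,
                              rng=np.random.default_rng(0))
    est = centroid_estimate(image)
    error = np.hypot(est[0] - true_center[0], est[1] - true_center[1])
    print(f"estimated center: ({est[0]:.2f}, {est[1]:.2f}), "
          f"error: {error:.3f} px")
```

Because the true center is a parameter of the generator, accuracy can be computed directly for any background or noise level, which is the property that makes evaluation and supervised training possible without hand-labeled images.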
