Precise localization of corneal reflections in eye images using deep learning trained on synthetic data

Sean Anthony Byrne, Marcus Nyström, Virmarie Maquiling, Enkelejda Kasneci, Diederick C. Niehorster

TL;DR

The paper tackles precise localization of the corneal reflection (CR) center in eye images without real-data annotations by training a CNN on synthetically generated CR images. It introduces a CNN with seven convolutional layers, trained in two stages, that localizes the CR center to sub-pixel accuracy and outperforms traditional methods on real eye images. The study shows that training on synthetic data alone can yield high-performance CR localization and modest gains in gaze precision, while acknowledging pupil noise as a limiting factor. The approach reduces the data-labeling burden and offers a flexible, scalable path for improving CR-based gaze trackers, with open-source code and data available for replication and extension. The work also highlights future directions, including pupil localization and handling multiple CRs, to extend applicability to real-world, low-quality imaging settings.

Abstract

We present a deep learning method for accurately localizing the center of a single corneal reflection (CR) in an eye image. Unlike previous approaches, we use a convolutional neural network (CNN) that was trained solely using simulated data. Using only simulated data has the benefit of completely sidestepping the time-consuming process of manual annotation that is required for supervised training on real eye images. To systematically evaluate the accuracy of our method, we first tested it on images with simulated CRs placed on different backgrounds and embedded in varying levels of noise. Second, we tested the method on high-quality videos captured from real eyes. Our method outperformed state-of-the-art algorithmic methods on real eye images with a 35% reduction in terms of spatial precision, and performed on par with state-of-the-art on simulated images in terms of spatial accuracy. We conclude that our method provides a precise method for CR center localization and a solution to the data availability problem, which is one of the important common roadblocks in the development of deep learning models for gaze estimation. Due to the superior CR center localization and ease of application, our method has the potential to improve the accuracy and precision of CR-based eye trackers.

Paper Structure (24 sections, 2 equations, 10 figures)

Figures (10)

  • Figure 1: Overview of our method: A CNN model with seven convolutional layers that increase in filter size from 64 to 512 and two dense layers returning the Cartesian coordinates of the CR center.
  • Figure 2: Example simulated CRs. Top row: example images used during model training and for the validation set. Left column: different values of Gaussian amplitude $A$. Right column: different pixel noise values $\sigma_n^{2}$ (image levels). For both columns, random positions (within $[-1.5r,1.5r]$) and orientations of the dividing line between the dark and light sections of the background are shown. Bottom row: example images used for evaluation, showing different background locations $E$ as well as a CR image without a gray background. The value for the varied parameter is denoted on the panels. $A$ was set to 10000 for all panels except the top-left. For illustration purposes, the CR radius ($r$) in these panels is 50 pixels. During both training and evaluation the pixel intensity of the lighter section of the background was also varied (not shown).
  • Figure 3: Full eye image (left) and masked cutout as processed by the radial symmetry and CNN methods (right).
  • Figure 4: Best achievable CR center localization errors for different Gaussian amplitudes $A$ (different panels) and CR radii $r$ (different lines in each panel).
  • Figure 5: Errors in CR center localization for different CR sizes $r$ for three methods. The panel insets show boxplots of the CR center localization error for each estimated input position. For all these simulations, $A=10000$, $E=0$, $I=128$.
  • ...and 5 more figures
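
The Figure 2 caption describes simulated CRs built from a Gaussian amplitude $A$, pixel noise, and a two-tone background split by a dividing line. A minimal sketch of that kind of image, using only the Python standard library, might look as follows. This is not the authors' generator: the function names, the parameter `sigma`, the additive clipping model (background + Gaussian + noise, clipped to [0, 255]), and the thresholded centroid used as a simple baseline are all assumptions made for illustration.

```python
import math
import random


def synth_cr(size=64, cx=31.3, cy=32.7, sigma=3.0, amplitude=10000.0,
             noise_sd=2.0, dark=0.0, light=128.0,
             edge_offset=0.0, edge_angle=0.0, seed=0):
    """Render a hypothetical CR image as a list of rows of floats in [0, 255].

    Assumed model: a 2D Gaussian of the given amplitude is added to a
    two-tone background plus Gaussian pixel noise, then clipped, so a
    large amplitude produces a flat, saturated blob.
    """
    rng = random.Random(seed)
    # Unit normal of the line dividing the dark and light background sections.
    nx, ny = math.cos(edge_angle), math.sin(edge_angle)
    img = []
    for y in range(size):
        row = []
        for x in range(size):
            # Signed distance of this pixel from the dividing line,
            # measured relative to the CR center.
            side = (x - cx) * nx + (y - cy) * ny - edge_offset
            background = light if side > 0 else dark
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            cr = amplitude * math.exp(-d2 / (2.0 * sigma ** 2))
            value = background + cr + rng.gauss(0.0, noise_sd)
            row.append(min(255.0, max(0.0, value)))
        img.append(row)
    return img


def threshold_centroid(img, threshold=200.0):
    """Intensity-weighted centroid of all pixels above a fixed threshold."""
    total = sx = sy = 0.0
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            w = v - threshold
            if w > 0.0:
                total += w
                sx += w * x
                sy += w * y
    if total == 0.0:
        raise ValueError("no pixel exceeds the threshold")
    return sx / total, sy / total


if __name__ == "__main__":
    true_center = (31.3, 32.7)
    for label, light in (("uniform background", 0.0), ("split background", 128.0)):
        img = synth_cr(light=light)
        ex, ey = threshold_centroid(img)
        err = math.hypot(ex - true_center[0], ey - true_center[1])
        print(f"{label}: estimate=({ex:.3f}, {ey:.3f}) error={err:.3f} px")
```

In this sketch the lighter background section enlarges the above-threshold region on one side of the CR, so the thresholded centroid tends to shift toward that side. Because the true center is known exactly for every generated image, images of this kind can serve as labeled data without any manual annotation.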