Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems

Viet Dung Nguyen, Reynold Bailey, Gabriel J. Diaz, Chengyi Ma, Alexander Fix, Alexander Ororbia

TL;DR

This work uses dimensionality-reduction techniques to measure the overlap between target eye images and synthetic training data, and to prune the training dataset so that this distribution overlap is maximized. It demonstrates that these methods yield robust, improved performance when tackling the discrepancy between simulated and real-world data samples.

Abstract

Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate. Segmentation models trained using supervised machine learning can excel at this task, but their effectiveness is determined by the degree of overlap between the narrow distributions of image properties defined by the target dataset and by highly specific training datasets, of which there are few. Attempts to broaden the distribution of existing eye image datasets through the inclusion of synthetic eye images have found that a model trained on synthetic images will often fail to generalize back to real-world eye images. As a remedy, we use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data, and to prune the training dataset in a manner that maximizes distribution overlap. We demonstrate that our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
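To make the idea of measuring overlap and pruning more concrete, here is a minimal sketch in Python with NumPy. It is not the authors' pipeline: the abstract does not specify the dimensionality-reduction method or the pruning rule at this level of detail, so this sketch assumes PCA computed by SVD and a Mahalanobis-distance ellipse around the target samples. The function names `pca_fit` and `prune_by_overlap`, the feature vectors, and the `radius` parameter are all hypothetical.

```python
import numpy as np


def pca_fit(x, n_components=2):
    """Return the mean and the top principal axes of x (rows are samples)."""
    mean = x.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:n_components]


def prune_by_overlap(source, target, n_components=2, radius=2.0):
    """Keep source rows whose PCA projection falls inside the target ellipse.

    Assumption of this sketch: the ellipse is the set of points within
    `radius` Mahalanobis units of the target mean in the reduced space.
    Returns a boolean keep-mask and the fraction of source rows kept.
    """
    # Fit one shared projection on both domains so they are comparable.
    mean, axes = pca_fit(np.vstack([source, target]), n_components)
    src_z = (source - mean) @ axes.T
    tgt_z = (target - mean) @ axes.T

    # Describe the target distribution by its mean and covariance.
    centre = tgt_z.mean(axis=0)
    inv_cov = np.linalg.inv(np.atleast_2d(np.cov(tgt_z, rowvar=False)))

    # Squared Mahalanobis distance of every source point to the target.
    diff = src_z - centre
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    keep = d2 <= radius ** 2
    return keep, float(keep.mean())


# Toy feature vectors standing in for image representations (assumed data).
rng = np.random.default_rng(0)
target = rng.normal(0.0, 1.0, size=(200, 16))   # "real" domain
near = rng.normal(0.0, 1.0, size=(100, 16))     # synthetic, overlapping
far = rng.normal(6.0, 1.0, size=(100, 16))      # synthetic, shifted away
keep, kept_fraction = prune_by_overlap(np.vstack([near, far]), target)
print(f"kept {keep.sum()} of {keep.size} synthetic samples")
```

In this toy setup the shifted synthetic cluster is pruned entirely, while most of the overlapping cluster survives. The kept fraction can serve as a crude overlap score, though it depends on the chosen `radius`.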

Paper Structure

This paper contains 11 sections, 14 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: Overall process diagram of our proposed computational system for image segmentation. The synthetic images are first refined/processed using our novel Structure Retaining CycleGAN, then filtered by our Siamese Network that considers the distance between the latent representations of real and synthetic images, and finally placed into a training set that is used for training our adapted domain adversarial neural network.
  • Figure 2: Sample images from the datasets used in our experiments. From left to right: OpenEDS (target domain) and four synthetic/constructed source domains: RITEyes, CGAN, SRCGAN, and SRCGAN-S.
  • Figure 3: Comparison of PCA plots of intermediate latent vectors for source and target domains produced by the DANN module (described in Section \ref{subsec:dann-eye}). Left: RITEyes (red) vs. OpenEDS (green). Middle: CGAN (red) vs. OpenEDS (green). Right: SRCGAN (red) vs. OpenEDS (green). Note that the red dots inside the ellipse make up the SRCGAN-S distribution, which represents filtered images that are close to the real distribution.
  • Figure 4: Model performance comparison on the real target dataset (OpenEDS) of the RITnet segmentation network (left) and our DANN segmentation network (right). Both models were trained on the $4$ source domains (see Figure 2). Shaded regions depict $\pm 1$ standard deviation in the $3$-fold cross-validation scheme.
  • Figure 5: Sample distance predictions of the Siamese Network for two images from the OpenEDS dataset (left) and for one image from RITEyes paired with one from OpenEDS (right).
  • ...and 1 more figure
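
The captions for Figures 1 and 5 describe filtering synthetic images by the distance between latent representations of real and synthetic images. The sketch below illustrates that filtering step only, in Python with NumPy. It assumes the embeddings have already been produced by some trained encoder (not shown), and it swaps in a plain nearest-neighbour Euclidean rule with a fixed threshold. The paper's Siamese Network predicts distances itself, so the function name `filter_by_latent_distance`, the rule, and the threshold value are assumptions for illustration.

```python
import numpy as np


def filter_by_latent_distance(syn_z, real_z, threshold):
    """Keep synthetic embeddings whose nearest real embedding is close enough.

    syn_z:  array of shape (n_syn, d), embeddings of synthetic images.
    real_z: array of shape (n_real, d), embeddings of real images.
    Returns a boolean keep-mask and each synthetic sample's nearest distance.
    """
    # Pairwise Euclidean distances via broadcasting, shape (n_syn, n_real).
    pairwise = np.linalg.norm(syn_z[:, None, :] - real_z[None, :, :], axis=-1)
    nearest = pairwise.min(axis=1)
    return nearest <= threshold, nearest


# Hand-made 2-D embeddings (assumed values, not taken from the paper).
real_z = np.array([[0.0, 0.0], [1.0, 0.0]])
syn_z = np.array([[0.1, 0.0], [3.0, 4.0], [1.0, 0.5]])
keep, nearest = filter_by_latent_distance(syn_z, real_z, threshold=0.6)
print(keep.tolist())  # [True, False, True]
```

The pairwise matrix grows as n_syn × n_real, so for large datasets the distances would need to be computed in batches.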