Appearance-based gaze estimation enhanced with synthetic images using deep neural networks

Dmytro Herashchenko, Igor Farkaš

TL;DR

This work builds a modular system that estimates gaze from separately cropped eyes, reusing existing, well-functioning components for face detection and head pose estimation. The feasibility of the model was verified by preliminary testing in a real-world setting using the built-in 4K camera in the eye of the NICO semi-humanoid robot.

Abstract

Human eye gaze estimation is an important cognitive ingredient for successful human-robot interaction, enabling the robot to read and predict human behavior. We approach this problem using artificial neural networks and build a modular system that estimates gaze from separately cropped eyes, taking advantage of existing well-functioning components for face detection (RetinaFace) and head pose estimation (6DRepNet). Our proposed method does not require any special hardware or infrared filters; it uses a standard notebook built-in RGB camera, as is common for appearance-based methods. Using the MetaHuman tool, we also generated a large synthetic dataset of more than 57,000 human faces and made it publicly available. Including this dataset (with eye gaze and head pose information) alongside the standard Columbia Gaze dataset in model training led to better accuracy, with a mean average error below two degrees in the eye pitch and yaw directions, which compares favourably to related methods. We also verified the feasibility of our model by preliminary testing in a real-world setting using the built-in 4K camera in the eye of the NICO semi-humanoid robot.
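
To make the reported error figure concrete, the following is a minimal sketch of how a per-axis error in degrees could be computed. It assumes that the "mean average error" is a mean absolute error taken separately over pitch and yaw; the abstract does not define the metric precisely, and the function name and data layout are hypothetical.

```python
from typing import Sequence, Tuple

# (pitch, yaw) in degrees; this layout is an assumption for illustration.
Angles = Tuple[float, float]


def mean_absolute_error_deg(
    predicted: Sequence[Angles], target: Sequence[Angles]
) -> Angles:
    """Return the mean absolute error for pitch and yaw, each in degrees."""
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("expected two non-empty sequences of equal length")
    n = len(predicted)
    # Average the absolute per-sample differences separately for each axis.
    pitch_mae = sum(abs(p[0] - t[0]) for p, t in zip(predicted, target)) / n
    yaw_mae = sum(abs(p[1] - t[1]) for p, t in zip(predicted, target)) / n
    return pitch_mae, yaw_mae
```

Under this reading, "below two degrees in eye pitch and yaw directions" would mean that both returned values are smaller than 2.0 on the evaluation set.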

Paper Structure (14 sections, 10 figures, 2 tables)


Figures (10)

  • Figure 1: A sample of the Columbia Gaze dataset (smith2013).
  • Figure 2: Synthetic faces generated by the MetaHuman tool (games2021).
  • Figure 3: A sample of images from our generated dataset.
  • Figure 4: Architecture of our eye gaze estimation system. Cropped eyes are enlarged for better visibility.
  • Figure 5: Image cropped using the RetinaFace module (deng2019).
  • ...and 5 more figures
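
The modular design described in the abstract (face detection, head pose estimation, then gaze estimation from separately cropped eyes) can be sketched roughly as follows. This is an illustrative skeleton only: every function and type name here is hypothetical, the components are passed in as plain callables rather than real RetinaFace or 6DRepNet calls, and the way the stages are wired (in particular, how the eyes are located and whether head pose is fed to the gaze model) is an assumption not confirmed by the text above.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]          # toy image: rows of pixel values
Box = Tuple[int, int, int, int]    # (x0, y0, x1, y1), end-exclusive
HeadPose = Tuple[float, float, float]
Gaze = Tuple[float, float]         # (pitch, yaw) in degrees


def crop(image: Image, box: Box) -> Image:
    """Cut the rectangle given by box out of a row-major image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def estimate_gaze(
    image: Image,
    detect_face: Callable[[Image], Box],
    estimate_head_pose: Callable[[Image], HeadPose],
    locate_eyes: Callable[[Image], Tuple[Box, Box]],
    gaze_model: Callable[[Image, Image, HeadPose], Gaze],
) -> Gaze:
    """Run the hypothetical pipeline: face -> head pose -> eye crops -> gaze."""
    face = crop(image, detect_face(image))       # stage 1: face detection
    head_pose = estimate_head_pose(face)         # stage 2: head pose
    left_box, right_box = locate_eyes(face)      # eye regions inside the face
    left_eye = crop(face, left_box)              # eyes are cropped separately
    right_eye = crop(face, right_box)
    return gaze_model(left_eye, right_eye, head_pose)  # stage 3: gaze
```

Passing the components as callables keeps the sketch independent of any particular library, which mirrors the paper's stated aim of reusing existing detectors rather than retraining them.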