Establishing a Baseline for Gaze-driven Authentication Performance in VR: A Breadth-First Investigation on a Very Large Dataset

Dillon Lohr; Michael J. Proulx; Oleg Komogortsev

TL;DR

This study establishes a baseline for gaze-driven authentication in VR using the GazePro dataset, which comprises gaze recordings from $9202$ participants at $72$ Hz, to evaluate a state-of-the-art embedding model. It systematically compares monocular vs binocular data and visual vs optical axes, and measures the impact of training size, data length, and signal quality on verification and identification. Key findings show that binocular gaze and dual-axis inputs drastically improve verification, and that longer training and larger training sets yield better performance. Verification remains stable with larger galleries, but identification degrades, with accuracy estimated to fall to chance level at around $1.48\times10^{5}$ identities. The results demonstrate FIDO-level potential for gaze authentication on realistic consumer hardware, given sufficient data, computation, and careful handling of signal quality and enrollment/verification durations. The study also notes limitations in long-term and real-world deployment scenarios.

Abstract

This paper establishes a baseline for gaze-driven authentication performance and begins answering fundamental research questions, using a very large dataset of gaze recordings from 9202 people with a level of eye tracking (ET) signal quality equivalent to modern consumer-facing virtual reality (VR) platforms. The employed dataset is at least an order of magnitude larger than any dataset from previous related work. Our model requires binocular estimates of both the optical and visual axes of the eyes, together with a minimum duration for enrollment and verification, to achieve a false rejection rate (FRR) below 3% at a false acceptance rate (FAR) of 1 in 50,000. Identification accuracy decreases with gallery size, and we estimate that our model would fall below chance-level accuracy for gallery sizes of 148,000 or more. Our major findings indicate that gaze authentication can be as accurate as required by the FIDO standard when driven by a state-of-the-art machine learning architecture and a sufficiently large training dataset.
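To make the headline metric concrete, the following is a minimal sketch of how an FRR at a fixed FAR can be read off from genuine and impostor similarity scores. It is not the authors' evaluation code: the function name `frr_at_far`, the convention that a comparison is accepted when its score is at or above the threshold, and the toy scores are all assumptions made for illustration.

```python
import numpy as np

def frr_at_far(genuine, impostor, target_far):
    """Pick the lowest threshold whose empirical FAR is <= target_far.

    Hypothetical helper for illustration. Assumes higher score = more similar,
    and that a comparison is accepted when score >= threshold.
    Returns (threshold, far, frr).
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.sort(np.asarray(impostor, dtype=float))  # ascending
    n_imp = impostor.size

    # Largest number of impostor comparisons we may accept.
    # The small epsilon guards against floating-point round-down.
    max_false_accepts = int(np.floor(target_far * n_imp + 1e-9))

    if max_false_accepts >= n_imp:
        threshold = -np.inf  # every comparison may be accepted
    else:
        # The impostor score just below the allowed top-k must be rejected,
        # so the threshold sits immediately above it.
        boundary = impostor[n_imp - max_false_accepts - 1]
        threshold = np.nextafter(boundary, np.inf)

    far = float(np.mean(impostor >= threshold))
    frr = float(np.mean(genuine < threshold))
    return threshold, far, frr

# Toy example with made-up scores (not data from the paper).
toy_impostor = np.arange(100) / 100.0          # 0.00, 0.01, ..., 0.99
toy_genuine = np.array([0.99, 0.97, 0.96, 0.50])
toy_threshold, toy_far, toy_frr = frr_at_far(toy_genuine, toy_impostor, 0.05)
```

With an empirical estimate like this, the smallest nonzero FAR that can be measured is one divided by the number of impostor comparisons, so an operating point of 1 in 50,000 needs at least 50,000 impostor scores. A large test population makes that many comparisons available.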

Paper Structure (18 sections, 7 figures, 1 table)


Introduction
Background
Methodology
Dataset
Model training
Model evaluation
Results
(RQ1) Monocular gaze vs binocular gaze.
(RQ2) Optical axis and visual axis.
(RQ3) Training epochs and minibatch size.
(RQ4) Training population size.
(RQ5) Testing population size.
(RQ6) Eye tracking signal quality.
(RQ7--8) Enrollment/verification duration.
(RQ9) Task-independence.
...and 3 more sections

Figures (7)

Figure 1: A simplified model of the eye, visualizing the imaginary optical and visual axes. The optical axis passes through the eyeball center and the pupil center. The visual axis connects the center of the fovea ("foveola") and the gaze target ("object of regard").
Figure 2: Overview diagram of the methodology for computing similarity scores.
Figure 3: Qualitative performance from Experiment 5. The left figure shows histograms of the genuine and impostor similarity score distributions. The right figure shows the curve.
Figure 4: Authentication performance vs training population size. Measurements are from Experiments 3 and 7--14.
Figure 5: Performance measures vs gallery size from Experiment 15. The shaded region represents the 5th and 95th percentiles across 100 random samples, and the black line is the average of the two percentiles. Top: . Bottom: Rank-1 .
...and 2 more figures
