Ocular Authentication: Fusion of Gaze and Periocular Modalities

Dillon Lohr, Michael J. Proulx, Mehedi Hasan Raju, Oleg V. Komogortsev

TL;DR

The study investigates calibration-free ocular authentication by fusing gaze and periocular modalities within a scalable multimodal pipeline. It analyzes embedding properties across EMA ($128$-D), PIA ($256$-D), and EF1 ($256$-D) and shows that EF1 fusion outperforms both unimodal baselines and surpasses the FIDO benchmark. A key finding is a strong exponential relationship between temporal persistence (measured by Kendall's coefficient of concordance, KCC) and biometric performance (equal error rate, EER); the high adjusted $R^2$ of the fit supports the predictive value of temporal stability. The results support practical deployment of large-scale, calibration-free ocular authentication in VR/AR contexts.

Abstract

This paper investigates the feasibility of fusing two eye-centric authentication modalities, eye movements and periocular images, within a calibration-free authentication system. While each modality has independently shown promise for user authentication, their combination within a unified gaze-estimation pipeline has not been thoroughly explored at scale. In this report, we propose a multimodal authentication system and evaluate it using a large-scale in-house dataset comprising 9202 subjects with an eye tracking (ET) signal quality equivalent to a consumer-facing virtual reality (VR) device. Our results show that the multimodal approach consistently outperforms both unimodal systems across all scenarios, surpassing the FIDO benchmark. The integration of a state-of-the-art machine learning architecture contributed significantly to the overall authentication performance at scale, driven by the model's ability to capture authentication representations and the complementary discriminative characteristics of the fused modalities.

Paper Structure

This paper contains 2 sections, 1 figure.

Figures (1)

  • Figure 1: Observations of EER vs. KCC. Each point uses the median KCC across the $n$ dimensions of the embedding, where $n = 128$ for EMA and $n = 256$ for the PIA and fused embeddings. An exponential curve is fit to the observations to highlight the trend, and the adjusted $R^2$ of the fit is annotated in the figure.
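
The kind of fit described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' procedure: the functional form $\mathrm{EER} \approx a\,e^{b \cdot \mathrm{KCC}}$, the log-linear least-squares fitting, the choice to compute adjusted $R^2$ in log space, and the helper names `fit_exponential` and `adjusted_r2` are all assumptions. The data points are synthetic and are not results from the paper.

```python
import numpy as np


def fit_exponential(kcc, eer):
    """Fit eer ~= a * exp(b * kcc) by least squares on log(eer).

    Assumed model form; the paper only states that an exponential
    curve is fit to the observations.
    """
    kcc = np.asarray(kcc, dtype=float)
    eer = np.asarray(eer, dtype=float)
    # A degree-1 polynomial fit returns (slope, intercept).
    b, log_a = np.polyfit(kcc, np.log(eer), deg=1)
    return float(np.exp(log_a)), float(b)


def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted R^2 for n_predictors predictors (intercept not counted)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    n = y.size
    return float(1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1))


# Synthetic (made-up) observations: EER falls as median KCC rises.
rng = np.random.default_rng(0)
kcc = np.linspace(0.3, 0.9, 12)
eer = 40.0 * np.exp(-5.0 * kcc) * np.exp(rng.normal(0.0, 0.05, kcc.size))

a, b = fit_exponential(kcc, eer)
eer_hat = a * np.exp(b * kcc)

# Goodness of fit is evaluated in log space, where the model is linear.
adj = adjusted_r2(np.log(eer), np.log(eer_hat), n_predictors=1)
print(f"a={a:.3f}, b={b:.3f}, adjusted R^2={adj:.4f}")
```

A negative fitted `b` corresponds to the trend the caption highlights: embeddings with higher temporal persistence are associated with lower error rates. Fitting directly in the original EER scale (for example with a nonlinear least-squares routine) would weight the high-EER points more heavily and can give a somewhat different curve.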