Mind Your Vision: Multimodal Estimation of Refractive Disorders Using Electrooculography and Eye Tracking

Xin Wei, Huakun Liu, Yutaro Hirao, Monica Perusquia-Hernandez, Katsutoshi Masai, Hideaki Uchiyama, Kiyoshi Kiyokawa

TL;DR

This work investigates passive screening of refractive errors by analyzing eye movements with two modalities: electrooculography (EOG) and video-based eye tracking. An LSTM-based classifier is trained to map eye-movement features to 13 diopter classes and is evaluated in both within-subject (subject-dependent) and cross-subject (subject-independent) setups. Multimodal fusion of EOG and eye tracking yields the best performance in the subject-dependent setting, with an average accuracy of 96.207%. Generalization across individuals is limited: the subject-independent average is 8.882%, barely above the 7.692% chance level for 13 classes and not significantly better than the unimodal baselines. The results highlight the feasibility of continuous, non-invasive refractive power estimation and point to the need for personalized calibration and robust wearability to translate this approach to real-world screening tools.

Abstract

Refractive errors are among the most common visual impairments globally, yet their diagnosis often relies on active user participation and clinical oversight. This study explores a passive method for estimating refractive power using two eye movement recording techniques: electrooculography (EOG) and video-based eye tracking. Using a publicly available dataset recorded under varying diopter conditions, we trained Long Short-Term Memory (LSTM) models to classify refractive power from unimodal (EOG or eye tracking) and multimodal configurations. We assessed performance in both subject-dependent and subject-independent settings to evaluate model personalization and generalizability across individuals. Results show that the multimodal model consistently outperforms the unimodal models, achieving the highest average accuracy in both settings: 96.207% in the subject-dependent scenario and 8.882% in the subject-independent scenario. However, generalization remains limited, with classification accuracy only marginally above chance in the subject-independent evaluations. Statistical comparisons in the subject-dependent setting confirmed that the multimodal model significantly outperformed the EOG and eye-tracking models, whereas no statistically significant differences were found in the subject-independent setting. Our findings demonstrate both the potential and the current limitations of refractive error estimation from eye movement data, contributing to the development of continuous, non-invasive screening methods using EOG signals and eye-tracking data.
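
The two evaluation settings differ only in how the data are split, and a minimal sketch can make that concrete. The code below is not the authors' pipeline: the function names, the toy sample counts, and the interleaved fold assignment are all assumptions made for illustration. Only the fold counts (37 leave-one-subject-out folds, eight folds per subject) come from the figure captions of the paper.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Subject-independent split: each subject is held out once as the test set.

    Yields (held_out_subject, train_indices, test_indices).
    """
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        train_idx = [
            i
            for sid, idxs in by_subject.items()
            if sid != held_out
            for i in idxs
        ]
        yield held_out, train_idx, test_idx


def within_subject_folds(sample_indices, n_folds=8):
    """Subject-dependent split: k-fold cross-validation inside one subject.

    Folds are assigned by position (j % n_folds); this assignment is an
    assumption, the source does not say how folds were formed.
    """
    for k in range(n_folds):
        test_idx = sample_indices[k::n_folds]
        train_idx = [i for j, i in enumerate(sample_indices) if j % n_folds != k]
        yield train_idx, test_idx


# Hypothetical toy layout: 37 subjects with 16 samples each.
SAMPLES_PER_SUBJECT = 16  # assumed, not taken from the dataset
subject_ids = [s for s in range(37) for _ in range(SAMPLES_PER_SUBJECT)]

loso_folds = list(leave_one_subject_out(subject_ids))
subject0_indices = [i for i, s in enumerate(subject_ids) if s == 0]
inner_folds = list(within_subject_folds(subject0_indices))

print(len(loso_folds), "subject-independent folds")
print(len(inner_folds), "subject-dependent folds for one subject")
```

In the subject-independent split the model never sees any data from the test subject, which is the condition under which the reported accuracy falls to near chance.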

Paper Structure

This paper contains 10 sections, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Illustration of the processing techniques. This figure presents a flowchart of the signal processing steps.
  • Figure 2: Classification accuracy of EOG, eye-tracking, and multimodal models for each participant in the subject-dependent scenario. Bars represent mean accuracy across eight cross-validation folds per subject, with error bars indicating the standard error of the mean (SEM). On average, the EOG model achieved 84.451% accuracy, the eye-tracking model 92.432%, and the multimodal model 96.207%.
  • Figure 3: Confusion matrices for the EOG, eye-tracking, and multimodal models in the subject-dependent scenario.
  • Figure 4: Classification accuracy of EOG, eye-tracking, and multimodal models in the subject-independent scenario. Bars represent mean accuracy across 37 leave-one-subject-out cross-validation folds, with each participant serving once as the test subject. Error bars indicate the standard error of the mean (SEM). On average, the EOG model achieved 7.936% accuracy, the eye-tracking model 8.640%, and the multimodal model 8.882%. The chance level for this 13-class classification task is 7.692%.
  • Figure 5: Confusion matrices for the EOG, eye-tracking, and multimodal models in the subject-independent scenario.
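
The summary statistics quoted in the captions for Figures 2 and 4 (mean accuracy, standard error of the mean, and the 13-class chance level) can be reproduced with a short sketch. The fold accuracies below are made-up placeholder values, not results from the paper, and the use of the sample standard deviation (n - 1 denominator) for the SEM is an assumption, since the captions do not state which estimator was used.

```python
import math
import statistics


def chance_level(n_classes):
    """Chance accuracy in percent for a balanced n-class problem."""
    return 100.0 / n_classes


def mean_and_sem(accuracies):
    """Return (mean, standard error of the mean) for per-fold accuracies.

    Uses the sample standard deviation (n - 1 denominator); this choice
    is an assumption, not something the source specifies.
    """
    mean = statistics.fmean(accuracies)
    sem = statistics.stdev(accuracies) / math.sqrt(len(accuracies))
    return mean, sem


# Hypothetical per-fold accuracies in percent, for illustration only.
fold_accuracies = [7.5, 8.0, 9.0, 8.5]

mean_acc, sem_acc = mean_and_sem(fold_accuracies)
print(f"mean = {mean_acc:.3f}%, SEM = {sem_acc:.3f}%")
print(f"chance level for 13 classes = {chance_level(13):.3f}%")
```

With 13 balanced classes the chance level is 100/13, which rounds to the 7.692% stated in the Figure 4 caption.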