Mind Your Vision: Multimodal Estimation of Refractive Disorders Using Electrooculography and Eye Tracking
Xin Wei, Huakun Liu, Yutaro Hirao, Monica Perusquia-Hernandez, Katsutoshi Masai, Hideaki Uchiyama, Kiyoshi Kiyokawa
TL;DR
This work investigates passive screening of refractive errors by analyzing eye movements with two modalities: electrooculography (EOG) and video-based eye tracking. An LSTM-based classifier is trained to map eye-movement features to $13$ diopter classes and is evaluated in both within-subject (subject-dependent) and cross-subject (subject-independent) setups. Fusing EOG and eye tracking yields the best performance in the subject-dependent setting, with an average accuracy of $96.207\%$. In the subject-independent setting, however, the fused model averages only $8.882\%$, which is close to chance for $13$ classes and not significantly better than the unimodal baselines. The results highlight the feasibility of continuous, non-invasive refractive power estimation and point to the need for personalized calibration and robust wearability to translate this approach to real-world screening tools.
Abstract
Refractive errors are among the most common visual impairments globally, yet their diagnosis often relies on active user participation and clinical oversight. This study explores a passive method for estimating refractive power using two eye movement recording techniques: electrooculography (EOG) and video-based eye tracking. Using a publicly available dataset recorded under varying diopter conditions, we trained Long Short-Term Memory (LSTM) models to classify refractive power from unimodal (EOG or eye tracking) and multimodal configurations. We assess performance in both subject-dependent and subject-independent settings to evaluate model personalization and generalizability across individuals. The multimodal model achieves the highest average accuracy in both settings: $96.207\%$ in the subject-dependent scenario and $8.882\%$ in the subject-independent scenario. Generalization therefore remains limited, with subject-independent classification accuracy only marginally above chance. Statistical comparisons in the subject-dependent setting confirmed that the multimodal model significantly outperformed the EOG and eye-tracking models, whereas no statistically significant differences were found in the subject-independent setting. Our findings demonstrate both the potential and the current limitations of refractive error estimation from eye movement data, contributing to the development of continuous, non-invasive screening methods using EOG signals and eye-tracking data.
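To make the two evaluation settings and the chance-level comparison concrete, the following is a minimal sketch and not the authors' protocol. It assumes that the subject-independent setting is realized as leave-one-subject-out and that the subject-dependent setting holds out part of each subject's own recordings. The function names and the `test_fraction` parameter are hypothetical, and the chance level assumes balanced classes.

```python
from collections import defaultdict

NUM_CLASSES = 13  # number of diopter classes stated in the text


def chance_accuracy(num_classes):
    """Expected accuracy of uniform random guessing (assumes balanced classes)."""
    return 1.0 / num_classes


def subject_independent_splits(subject_ids):
    """Assumed leave-one-subject-out scheme: test on one unseen subject,
    train on all remaining subjects."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def subject_dependent_splits(subject_ids, test_fraction=0.2):
    """Assumed per-subject scheme: train and test on different samples
    of the same subject (test_fraction is a hypothetical choice)."""
    by_subject = defaultdict(list)
    for i, s in enumerate(subject_ids):
        by_subject[s].append(i)
    for subject in sorted(by_subject):
        idx = by_subject[subject]
        n_test = max(1, int(round(len(idx) * test_fraction)))
        yield subject, idx[:-n_test], idx[-n_test:]


if __name__ == "__main__":
    # Toy sample-to-subject assignment, for illustration only.
    subject_ids = ["s1"] * 5 + ["s2"] * 5 + ["s3"] * 5
    for held_out, train, test in subject_independent_splits(subject_ids):
        print(held_out, len(train), len(test))
    print(f"chance level: {100 * chance_accuracy(NUM_CLASSES):.2f}%")
```

Under the balanced-class assumption, chance for $13$ classes is $1/13$, roughly $7.7\%$, so the reported subject-independent average of $8.882\%$ sits only about one percentage point above random guessing.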
