Evaluating Cross-Subject and Cross-Device Consistency in Visual Fixation Prediction
Yuli Wu, Henning Konermann, Emil Mededovic, Peter Walter, Johannes Stegmaier
TL;DR
This work evaluates cross-subject and cross-device consistency in visual fixation prediction by comparing wearable Aria Glasses against a high-end reference eye tracker, using fixations from 9 participants on a 300-image subset of MIT1003. Applying a pretrained gaze model and standard saliency metrics, it shows that average fixation maps generalize across devices for simple stimuli, whereas individual-level generalization remains weak, which underscores the value of group-level data for reliable saliency prediction. The findings inform neuroprosthetic applications that rely on average fixation information, and the data are released publicly to enable broader validation and development. Future work includes testing foundation-model-based gaze predictions on larger datasets and exploring scene simplification and personalized fixation models for complex scenes and visual impairments.
Abstract
Understanding cross-subject and cross-device consistency in visual fixation prediction is essential for advancing eye-tracking applications, including visual attention modeling and neuroprosthetics. This study evaluates fixation consistency using an eye tracker embedded in regular-sized glasses and compares it with high-end standalone eye-tracking systems. Nine participants viewed 300 images from the MIT1003 dataset in a user study, which allowed us to analyze cross-device and cross-subject variation in fixation patterns using several saliency evaluation metrics. Our findings indicate that average visual fixations can be reliably transferred across devices for relatively simple stimuli. However, individual-to-average consistency remains weak, highlighting the difficulty of predicting individual fixations across devices. These results provide an empirical foundation for leveraging predicted average visual fixation data to enhance neuroprosthetic applications.
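To make the two comparisons concrete (average-to-average across devices, and individual-to-average), the following is a minimal sketch and not the authors' pipeline. It assumes fixations are given as (row, col) pixel coordinates, uses three commonly reported saliency metrics (CC, SIM, NSS) since the abstract does not name the metrics used, and runs on synthetic fixations. All function names, the blur width, and the simulated data are illustrative assumptions.

```python
import numpy as np

def fixation_map(points, shape):
    """Count map with one increment per in-bounds (row, col) fixation."""
    fmap = np.zeros(shape, dtype=float)
    for r, c in points:
        if 0 <= r < shape[0] and 0 <= c < shape[1]:
            fmap[r, c] += 1.0
    return fmap

def gaussian_blur(fmap, sigma):
    """Separable Gaussian blur (NumPy only); the map must exceed the kernel size."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    smooth = lambda v: np.convolve(v, kernel, mode="same")
    return np.apply_along_axis(smooth, 1, np.apply_along_axis(smooth, 0, fmap))

def density_map(points, shape, sigma=3.0):
    """Blurred fixation map normalized to sum to 1 (sigma is an assumed value)."""
    dmap = gaussian_blur(fixation_map(points, shape), sigma)
    return dmap / dmap.sum()

def cc(a, b):
    """Pearson correlation coefficient between two maps."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float((a * b).mean())

def sim(a, b):
    """Histogram intersection of two maps, each normalized to sum to 1."""
    return float(np.minimum(a / a.sum(), b / b.sum()).sum())

def nss(saliency, fixations):
    """Mean z-scored saliency value at fixated pixels."""
    z = (saliency - saliency.mean()) / saliency.std()
    return float(z[fixations > 0].mean())

# Synthetic stand-in data: 9 subjects per device fixating near the image center.
rng = np.random.default_rng(0)
shape = (48, 64)
center = np.array([24, 32])

def simulate_subject(n_fix=8, spread=4.0):
    pts = rng.normal(center, spread, size=(n_fix, 2)).round().astype(int)
    return [tuple(p) for p in pts]

device_a = [simulate_subject() for _ in range(9)]
device_b = [simulate_subject() for _ in range(9)]

# Average maps pool all subjects on a device; the individual map uses one subject.
avg_a = density_map([p for s in device_a for p in s], shape)
avg_b = density_map([p for s in device_b for p in s], shape)
ind_b = density_map(device_b[0], shape)

print("average-to-average CC:   ", cc(avg_a, avg_b))
print("individual-to-average CC:", cc(ind_b, avg_a))
print("average-to-average SIM:  ", sim(avg_a, avg_b))
```

With real recordings, the simulated lists would be replaced by per-subject fixations for each image and device, and the scores would be aggregated over images before comparing the two settings.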
