Evaluating Cross-Subject and Cross-Device Consistency in Visual Fixation Prediction

Yuli Wu, Henning Konermann, Emil Mededovic, Peter Walter, Johannes Stegmaier

TL;DR

This work evaluates cross-subject and cross-device consistency in visual fixation prediction by comparing gaze data from wearable Aria Glasses against a high-end reference eye tracker. Nine participants viewed a 300-image subset of the MIT1003 dataset. Using a pretrained gaze model and standard saliency metrics, it shows that average fixation maps generalize across devices for simple stimuli, whereas individual-level generalization remains weak, which underscores the value of group-level data for reliable saliency predictions. The findings inform neuroprosthetic applications that rely on average fixation information, and the data are released publicly to enable broader validation and development. Future work includes testing foundation-model-based gaze predictions on larger datasets and exploring scene simplification and personalized fixation models for complex scenes and visual impairments.

Abstract

Understanding cross-subject and cross-device consistency in visual fixation prediction is essential for advancing eye-tracking applications, including visual attention modeling and neuroprosthetics. This study evaluates fixation consistency using an embedded eye tracker integrated into regular-sized glasses, comparing its performance with high-end standalone eye-tracking systems. Nine participants viewed 300 images from the MIT1003 dataset in subjective experiments, allowing us to analyze cross-device and cross-subject variations in fixation patterns with various evaluation metrics. Our findings indicate that average visual fixations can be reliably transferred across devices for relatively simple stimuli. However, individual-to-average consistency remains weak, highlighting the challenges of predicting individual fixations across devices. These results provide an empirical foundation for leveraging predicted average visual fixation data to enhance neuroprosthetic applications.

Paper Structure

This paper contains 11 sections, 7 equations, 14 figures, 1 table.

Figures (14)

  • Figure 1: Application of the visual fixation prediction on neuroprosthetics.
  • Figure 2: Pipeline of visual fixation prediction. Using recorded videos from the monochrome cameras embedded in the Aria Glasses, we generate both the fixation map and the saliency map.
  • Figure 3: Visualization of fixations and saliency maps. We showcase five images from the MIT1003 dataset, along with the average saliency maps derived from both the MIT1003 ground truth (15 subjects using the ETL 400 ISCAN eye tracker) and our own experiments (9 subjects using the Aria Glasses). For the comparison of visual fixations and corresponding saliency maps, three random subjects are selected from the experimental group.
  • Figure 4: Cross-Subject Consistency. AUC-Judd (upper triangle in blue) and similarity scores (lower triangle in orange) are reported as mean ± standard deviation across individual subject pairs.
  • Figure 5: Scanpaths from 9 subjects on an example image. Fixation points are connected sequentially, with a temporal color gradient from red to violet. Best viewed in color and with zoom for details.
  • ...and 9 more figures
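
The Figure 4 caption reports AUC-Judd and similarity scores between subject pairs. As a rough illustration of what these two saliency metrics measure, the following is a minimal sketch in Python with NumPy. It is not the authors' evaluation code: the function names `similarity` and `auc_judd` are hypothetical, the maps are assumed to be 2D arrays of equal shape, and ties between saliency values are handled by counting all fixations at or above each threshold, which may differ slightly from other published implementations.

```python
import numpy as np


def similarity(map_a, map_b):
    """Similarity (SIM): overlap of two maps after normalizing each to sum to 1."""
    a = np.asarray(map_a, dtype=float)
    b = np.asarray(map_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("maps must have the same shape")
    if a.sum() <= 0 or b.sum() <= 0:
        raise ValueError("each map needs a positive sum")
    # 1.0 means identical distributions, 0.0 means no overlap at all.
    return float(np.minimum(a / a.sum(), b / b.sum()).sum())


def auc_judd(saliency_map, fixation_map):
    """AUC-Judd: area under the ROC curve, using the saliency values
    at fixated pixels as the classification thresholds."""
    s = np.asarray(saliency_map, dtype=float).ravel()
    f = np.asarray(fixation_map).ravel() > 0  # binary fixation mask
    if s.size != f.size:
        raise ValueError("maps must have the same number of pixels")
    n_fix = int(f.sum())
    n_pix = s.size
    if n_fix == 0 or n_fix == n_pix:
        raise ValueError("need both fixated and non-fixated pixels")

    sal_at_fix = s[f]
    thresholds = np.sort(sal_at_fix)[::-1]  # highest threshold first

    # ROC curve anchored at (0, 0) and (1, 1).
    tp = np.zeros(n_fix + 2)
    fp = np.zeros(n_fix + 2)
    tp[-1] = 1.0
    fp[-1] = 1.0
    for i, t in enumerate(thresholds, start=1):
        hits = np.count_nonzero(sal_at_fix >= t)  # fixations at or above t
        above = np.count_nonzero(s >= t)          # all pixels at or above t
        tp[i] = hits / n_fix
        fp[i] = (above - hits) / (n_pix - n_fix)

    # Trapezoidal integration of the true-positive rate over the false-positive rate.
    return float(np.sum(np.diff(fp) * (tp[1:] + tp[:-1]) / 2.0))
```

In a cross-subject comparison, one subject's binary fixation map would be passed as `fixation_map` and another subject's (or the group's) blurred saliency map as `saliency_map`. A uniform saliency map scores 0.5 under this sketch, which corresponds to chance level.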