A Comparative Study of Scanpath Models in Graph-Based Visualization

Angela Lopez-Cardona, Parvin Emami, Sebastian Idesis, Saravanakumar Duraisamy, Luis A. Leiva, Ioannis Arapakis

TL;DR

This study addresses the challenge of predicting human gaze on graph-based visualizations by comparing ground-truth eye-tracking data from 40 participants with synthetic scanpaths from three generative models (UMSS, DeepGaze++, and Gazeformer). It systematically varies graph node count (3 vs. 6) and question difficulty, and assesses model accuracy using four metrics: DTW, Eyenalysis, Determinism, and Laminarity. DeepGaze++ provides the best quantitative match, but UMSS yields more visually plausible scanpaths. Gazeformer struggles because it produces few fixations and has limited training data, which underscores the importance of task encoding and dataset specificity. The work highlights the need for careful parameter tuning, and potentially fine-tuning on task-specific data, to improve scanpath synthesis for InfoVis design and evaluation.
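As a rough illustration of how a DTW-style comparison between a human scanpath and a synthetic one might work, the following minimal sketch computes a classic dynamic time warping distance over fixation coordinates. The function name `dtw_distance`, the Euclidean point cost, and the toy coordinates are assumptions made for illustration; the study's actual implementation, normalization, and coordinate units are not given here.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two scanpaths.

    Each scanpath is a sequence of (x, y) fixation coordinates.
    Assumption: Euclidean distance is used as the point-wise cost.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed alignment moves.
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats its fixation
                cost[i][j - 1],      # b advances, a repeats its fixation
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical toy scanpaths (not data from the study).
human = [(0, 0), (1, 0), (2, 0)]
model = [(0, 0), (2, 0)]
print(dtw_distance(human, model))
```

Because DTW allows one fixation to be matched against several in the other sequence, it tolerates scanpaths of different lengths. A lower value indicates a closer match between the synthetic and the human scanpath.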

Abstract

Information Visualization (InfoVis) systems utilize visual representations to enhance data interpretation. Understanding how visual attention is allocated is essential for optimizing interface design. However, collecting eye-tracking (ET) data presents challenges related to cost, privacy, and scalability. Computational models provide alternatives for predicting gaze patterns, thereby advancing InfoVis research. In our study, we conducted an ET experiment with 40 participants who analyzed graphs while responding to questions of varying complexity within the context of digital forensics. We compared human scanpaths with synthetic ones generated by models such as DeepGaze++, UMSS, and Gazeformer. Our research evaluates the accuracy of these models and examines how question complexity and number of nodes influence performance. This work contributes to the development of predictive modeling in visual analytics, offering insights that can enhance the design and effectiveness of InfoVis systems.

Paper Structure

This paper contains 19 sections, 3 figures, and 13 tables.

Figures (3)

  • Figure 1: Examples of graph adaptations, from the baseline (a) to the fully adapted version (g). Panels (b–f) show the five possible partial adaptations.
  • Figure 2: Each trial followed a structured sequence: fixation cross, question screen, graph screen, and resting period. This procedure was repeated for all 120 trials.
  • Figure 3: Comparison of scanpaths generated by different models on a graph: (a) Actual eye movement sequence (ground truth); (b–d) Scanpaths predicted by UMSS, DeepGaze++, and Gazeformer, respectively. Colored dots represent fixation points, with connecting lines indicating the fixation order. Color intensity represents the temporal order of fixations, with yellow indicating the first point.