Eyes Can't Always Tell: Fusing Eye Tracking and User Priors for User Modeling under AI Advice Conditions

Xin Sun, Shu Wei, Ting Pan, Yajing Wang, Jos A. Bosch, Isao Echizen, Abdallah El Ali, Saku Sugawara

Abstract

Modeling users' cognitive states (e.g., cognitive load and decision confidence) is essential for building adaptive AI in high-stakes decision-making. While eye tracking provides non-invasive behavioral signals correlated with cognitive effort, prior work has not systematically examined how AI assistance contexts, specifically varying advice reliability and user heterogeneity, can alter the mapping between gaze signals and cognitive states. We conducted a within-subject lab eye-tracking study (N=54) on factual verification tasks under three conditions: No-AI, Correct-AI advice, and Incorrect-AI advice. We analyze condition-dependent changes in self-reports and eye-tracking patterns and evaluate the robustness of eye-tracking-based user modeling. Results show that AI advice increases decision confidence compared to No-AI, while Correct-AI is associated with lower perceived cognitive load and more efficient gaze behavior. Crucially, predictive modeling is context-sensitive: the relationship between eye-tracking signals and cognitive states shifts across AI conditions. Finally, fusing eye-tracking features with user priors (demographics, AI literacy/experience, and propensity to trust technology) improves cross-participant generalization. These findings support condition-aware and personalized user modeling for cognitively aligned adaptive AI systems.

Paper Structure

This paper contains 29 sections, 3 equations, 4 figures, and 4 tables.

Figures (4)

  • Figure 1: Study overview and procedure. (Top): Three-step workflow. Step 1 collects eye-tracking signals and self-reports during factual verification under three within-subject AI conditions. Step 2 extracts trial-level eye-tracking signals and outcomes (cognitive load, decision confidence, and accuracy) together with participant-level user priors (e.g., demographics, AI literacy, propensity to trust). Step 3 trains machine learning models to predict users' cognitive load, decision confidence, and accuracy from eye-tracking features alone or fused with user priors, across AI conditions. (Bottom): Study procedure in Step 1: consent and pre-survey, followed by counterbalanced trials spanning three AI conditions with concurrent eye-tracking and self-reports.
  • Figure 2: Per-trial gaze heatmaps for conditions with and without AI. Higher density indicates greater visual attention.
  • Figure 3: Gaze features (i.e., Fixation/Saccade Count, Pupil Diameter, and Time-to-First-Fixation) across AOIs (Context, AI Advice, and User Rating areas). Bars show the mean. Values in the upper-left ($\chi^2$, p) are results from a linear mixed-effects model (MixedLM; FDR-corrected), and brackets indicate significant pairwise t-tests $(*p < .05, **p < .01, ***p < .001)$.
  • Figure 4: SHAP analysis: top 10 most important features in user modeling across two AI conditions (Correct-AI vs. Incorrect-AI), predicting self-reported cognitive states (cognitive load and confidence) and decision accuracy with ExtraTrees classifiers. ("Sacc" = saccades; "Fixa" = fixations; "Pupil" = pupil diameter; "Liter" = AI literacy; "Expe" = AI experience; "Demo" = demographics.)
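The fusion-and-prediction workflow of Steps 2–3 can be sketched as follows. This is an illustrative reconstruction, not the authors' exact pipeline: the feature names, synthetic data, model settings, and the use of `GroupKFold` for participant-held-out evaluation are all assumptions. It shows early fusion (concatenating trial-level gaze features with repeated participant-level priors) and cross-participant evaluation, where every participant's trials stay within a single fold so test participants are unseen during training.

```python
# Hypothetical sketch of Steps 2-3: fuse trial-level gaze features with
# participant-level user priors, then evaluate cross-participant
# generalization. All names and values below are illustrative assumptions.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)
n_participants, trials_per_p = 54, 12          # N=54 from the study
n = n_participants * trials_per_p

# Trial-level eye-tracking features (e.g., fixation count, saccade count,
# mean pupil diameter, time-to-first-fixation on the AI-advice AOI).
gaze = rng.normal(size=(n, 4))

# Participant-level priors (e.g., demographics, AI literacy, propensity to
# trust technology), repeated across each participant's trials.
priors = np.repeat(rng.normal(size=(n_participants, 3)), trials_per_p, axis=0)

X = np.hstack([gaze, priors])                  # early fusion by concatenation
y = rng.integers(0, 2, size=n)                 # e.g., binarized cognitive load
groups = np.repeat(np.arange(n_participants), trials_per_p)

# GroupKFold keeps each participant's trials in exactly one fold, so the
# score reflects generalization to unseen participants.
clf = ExtraTreesClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, groups=groups, cv=GroupKFold(n_splits=6))
print(round(scores.mean(), 3))
```

For the attribution analysis in Figure 4, a `shap.TreeExplainer` could be run on the fitted classifier to rank features; the fitted model's `feature_importances_` attribute offers a rougher built-in alternative.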