Decoding Reading Goals from Eye Movements
Omer Shubi, Cfir Avraham Hadar, Yevgeni Berzak
TL;DR
This work investigates whether reading goals can be decoded from eye movements, specifically by distinguishing information seeking from ordinary reading for comprehension. It evaluates a broad suite of models, including transformer-based architectures that combine scanpath data with text, and introduces a logistic ensemble that combines model predictions. Fixation-level, text-aware transformers achieve the best single-model performance, accurate predictions are feasible online, before a reader finishes a passage, and the ensemble provides additional gains. A mixed-effects analysis of model errors identifies the textual and reader factors that drive task difficulty, which clarifies how eye-movement patterns vary across the two reading regimes. The findings may inform applications in education and assistive technologies.
Abstract
Readers can have different goals with respect to the text that they are reading. Can these goals be decoded from their eye movements over the text? In this work, we examine for the first time whether it is possible to distinguish between two common types of reading goals: information seeking and ordinary reading for comprehension. Using large-scale eye tracking data, we address this task with a wide range of models that cover different architectural and data representation strategies, and further introduce a new model ensemble. We find that transformer-based models that couple scanpath representations with language modeling perform best on this task, and that accurate predictions can be made in real time, long before the participant has finished reading the text. We further introduce a new method for model performance analysis based on mixed-effects modeling. Combining this method with rich textual annotations reveals key properties of textual items and participants that contribute to the difficulty of the task, and improves our understanding of the variability in eye movement patterns across the two reading regimes.
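
To make the idea of an ensemble over model predictions concrete, the following is a minimal sketch of one common way to build it: a logistic regression fitted on the output probabilities of the base models (often called stacking). It is an illustration under stated assumptions, not the implementation used in this work. The function names, the label encoding (1 for information seeking, 0 for ordinary reading), the training settings, and the synthetic data are all hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_ensemble(base_probs, labels, lr=0.5, epochs=500):
    """Fit weights w and bias b so that sigmoid(w . p + b) predicts the label.

    base_probs: list of rows, one row per trial, one probability per base model.
    labels: list of 0/1 reading-goal labels (hypothetical encoding).
    Uses plain full-batch gradient descent on the logistic loss.
    """
    n_models = len(base_probs[0])
    n = len(labels)
    w = [0.0] * n_models
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for row, y in zip(base_probs, labels):
            err = sigmoid(sum(wi * pi for wi, pi in zip(w, row)) + b) - y
            for j, pj in enumerate(row):
                grad_w[j] += err * pj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def ensemble_predict(w, b, row):
    # Ensemble probability that the trial has label 1.
    return sigmoid(sum(wi * pi for wi, pi in zip(w, row)) + b)


def make_synthetic_trials(n, seed=0):
    """Synthetic stand-in for base-model outputs (NOT real eye tracking data).

    Column 0 mimics an informative base model, column 1 an uninformative one.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        informative = 0.5 + (0.3 if y else -0.3) + rng.gauss(0, 0.15)
        informative = min(1.0, max(0.0, informative))
        uninformative = rng.random()
        rows.append([informative, uninformative])
        labels.append(y)
    return rows, labels
```

Under these assumptions, calling `fit_logistic_ensemble` on the output of `make_synthetic_trials` yields one weight per base model, and `ensemble_predict` returns a combined probability for a new trial. In practice the ensemble weights would be fitted on held-out predictions rather than on the data used to train the base models, to avoid rewarding overfit base models.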
