GazeIntent: Adapting dwell-time selection in VR interaction with real-time intent modeling
Anish S. Narkar, Jan J. Michalak, Candace E. Peacock, Brendan David-John
TL;DR
GazeIntent addresses the Midas touch problem in gaze-only VR interaction by introducing a real-time gaze-intent model that scales dwell-time thresholds. An LSTM-based intent predictor (F1 = 0.94 offline) is trained on a VR divisibility dataset and deployed to adapt dwell times via a scaling factor computed from recent predictions, enabling faster and more reliable selections. In end-user studies with new and returning users, the GI-G and especially the GI-P approaches improve interaction speed and are preferred by users, demonstrating both task generalization and personalization benefits. While promising, the work notes its limited sample size and calls for future work on threshold adaptation, continual learning, and broader task contexts, as well as attention to the privacy implications of predictive modeling in VR.
Abstract
The use of ML models to predict a user's cognitive state from behavioral data has been studied for various applications, including predicting the intent to perform selections in VR. We developed a novel technique that uses gaze-based intent models to adapt dwell-time thresholds to aid gaze-only selection. A dataset of users performing selections in arithmetic tasks was used to develop intent prediction models (F1 = 0.94). We developed GazeIntent to adapt selection dwell times based on intent model outputs and conducted an end-user study with returning and new users performing additional tasks with varied selection frequencies. Personalized models for returning users effectively accounted for prior experience and were preferred by 63% of users. Our work provides the field with methods to adapt dwell-based selection to users, account for experience over time, and consider tasks that vary by selection frequency.
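To illustrate the idea of adapting a dwell-time threshold from recent intent predictions, the following is a minimal Python sketch, not the authors' implementation. The class name DwellAdapter, the 1.0 s base dwell time, the 0.5 minimum scale, the 10-sample window, and the linear mapping from mean predicted intent to a scaling factor are all assumptions made for illustration; the summary above states only that a scaling factor is computed from recent predictions.

```python
from collections import deque


class DwellAdapter:
    """Hypothetical helper: scales a dwell-time threshold from recent intent predictions."""

    def __init__(self, base_dwell_s=1.0, min_scale=0.5, window=10):
        # All defaults are illustrative assumptions, not values from the paper.
        self.base_dwell_s = base_dwell_s
        self.min_scale = min_scale
        self.recent = deque(maxlen=window)  # sliding window of intent probabilities

    def update(self, intent_prob):
        """Record one intent-model output, clamped to [0, 1]."""
        self.recent.append(min(1.0, max(0.0, intent_prob)))

    def scale(self):
        """Return a factor in [min_scale, 1.0]; higher predicted intent gives a smaller factor."""
        if not self.recent:
            return 1.0  # no evidence yet: keep the base dwell time
        mean_intent = sum(self.recent) / len(self.recent)
        return 1.0 - (1.0 - self.min_scale) * mean_intent

    def threshold_s(self):
        """Current dwell-time threshold in seconds."""
        return self.base_dwell_s * self.scale()


adapter = DwellAdapter()
for p in (0.9, 0.95, 1.0):  # made-up model outputs suggesting intent to select
    adapter.update(p)
print(f"adapted dwell threshold: {adapter.threshold_s():.3f} s")
```

In this sketch, a run of high intent predictions shortens the dwell time toward base_dwell_s * min_scale, while low predictions leave it near the base value, which is one simple way to speed up intended selections without making unintended ones easier to trigger.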
