FocusFlow: 3D Gaze-Depth Interaction in Virtual Reality Leveraging Active Visual Depth Manipulation
Chenyang Zhang, Tiansu Chen, Eric Shaffer, Elahe Soltanaghai
TL;DR
This paper enables hands-free, gaze-based interaction in VR by using visual depth as an input dimension, while addressing depth-estimation noise and the Midas Touch problem (unintended activation caused simply by looking at something). It introduces FocusFlow, a system that combines a layer-based UI with a Virtual Window, and uses adaptive visual cues and two learning strategies to train users in depth control. Through a pilot study and a 24-participant user study, the authors show that adaptive cues improve depth perception and reduce false triggers, with activation times of around 1.3 seconds, which suggests the approach is practically viable. Limitations include the small sample size and fatigue concerns; future work points toward multi-modal integration and continuous depth input to broaden applicability.
Abstract
Gaze interaction presents a promising avenue in Virtual Reality (VR) due to its intuitive and efficient user experience. Yet, the depth control inherent in our visual system remains underutilized in current methods. In this study, we introduce FocusFlow, a hands-free interaction method that capitalizes on human visual depth perception within the 3D scenes of VR. We first develop a binocular visual depth detection algorithm to understand eye input characteristics. We then propose a layer-based user interface and introduce the concept of a 'Virtual Window', which offers intuitive and robust gaze-depth VR interaction despite the limited accuracy and precision of visual depth estimation at greater distances. Finally, to help novice users actively manipulate their visual depth, we propose two learning strategies that use different visual cues to help users master visual depth control. Our user studies with 24 participants demonstrate the usability of the proposed Virtual Window concept as a gaze-depth interaction method. In addition, our findings reveal that the user experience can be enhanced through an effective learning process with adaptive visual cues, which helps users develop muscle memory for this new input mechanism. We conclude the paper by discussing strategies to optimize learning and potential research topics in gaze-depth interaction.
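The abstract does not spell out the binocular depth detection algorithm, so the following is only an illustrative sketch of one common geometric approach, not the authors' method: treat each eye's gaze as a ray and take the point where the two rays pass closest to each other as the fixation point. The function name `vergence_depth`, its parameters, and the 64 mm eye separation in the usage lines are assumptions made for this example.

```python
import numpy as np


def vergence_depth(left_origin, left_dir, right_origin, right_dir, eps=1e-9):
    """Estimate fixation depth (same units as the inputs) from two gaze rays.

    Hypothetical helper: returns the distance from the midpoint between the
    eyes to the midpoint of the shortest segment joining the two gaze rays,
    or None when the rays are (near-)parallel or diverge.
    """
    o1 = np.asarray(left_origin, dtype=float)
    o2 = np.asarray(right_origin, dtype=float)
    d1 = np.asarray(left_dir, dtype=float)
    d2 = np.asarray(right_dir, dtype=float)
    d1 = d1 / np.linalg.norm(d1)  # unit gaze directions
    d2 = d2 / np.linalg.norm(d2)

    w = o1 - o2
    b = float(d1 @ d2)
    d = float(d1 @ w)
    e = float(d2 @ w)
    denom = 1.0 - b * b  # approaches 0 as the rays become parallel
    if denom < eps:
        return None

    # Ray parameters of the closest-approach points on each gaze ray.
    t1 = (b * e - d) / denom
    t2 = (e - b * d) / denom
    if t1 <= 0.0 or t2 <= 0.0:
        return None  # closest approach lies behind the eyes: diverging gaze

    fixation = 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))
    return float(np.linalg.norm(fixation - 0.5 * (o1 + o2)))


# Usage: eyes 64 mm apart (assumed), both looking at a target 1 m straight ahead.
left_eye = np.array([-0.032, 0.0, 0.0])
right_eye = np.array([0.032, 0.0, 0.0])
target = np.array([0.0, 0.0, 1.0])
depth = vergence_depth(left_eye, target - left_eye, right_eye, target - right_eye)
```

Because the vergence angle changes very little between distant targets, small angular noise in the gaze directions translates into large depth errors far from the user. This is consistent with the accuracy and precision constraint the abstract mentions, and it is one reason a coarse, layer-based design can be more robust than treating depth as a fine-grained continuous value.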
