Eyes on Many: Evaluating Gaze, Hand, and Voice for Multi-Object Selection in Extended Reality

Mohammad Raihanul Bashar; Aunnoy K Mutasim; Ken Pfeuffer; Anil Ufuk Batmaz

TL;DR

Eyes on Many investigates controller-free multi-object selection in XR by evaluating four mode-switching techniques and three subselection methods across moderate-to-large target sets. The study employs a within-subject design in a VR grid to measure speed, accuracy, and workload, identifying a strong advantage for persistent-mode toggles, particularly DoublePinch, combined with gaze+pinch subselection. Key findings show that SemiPinch underperforms due to instability and fatigue, while DoublePinch+gP delivers the best speed-accuracy-efficiency balance; voice reduces physical effort but repeated commands for subselection can be tedious. These results yield practical design guidelines for scalable, headset-native multi-selection in XR, emphasizing mode stability, efficient gaze confirmation, and careful use of hands-free input for high-level commands.

Abstract

Interacting with multiple objects simultaneously makes us fast. A pre-step to this interaction is to select the objects, i.e., multi-object selection, which is enabled through two steps: (1) toggling multi-selection mode -- mode-switching -- and then (2) selecting all the intended objects -- subselection. In extended reality (XR), each step can be performed with the eyes, hands, and voice. To examine how design choices affect user performance, we evaluated four mode-switching (SemiPinch, FullPinch, DoublePinch, and Voice) and three subselection techniques (Gaze+Dwell, Gaze+Pinch, and Gaze+Voice) in a user study. Results revealed that while DoublePinch paired with Gaze+Pinch yielded the highest overall performance, SemiPinch achieved the lowest performance. Although Voice-based mode-switching showed benefits, Gaze+Voice subselection was less favored, as the required repetitive vocal commands were perceived as tedious. Overall, these findings provide empirical insights and inform design recommendations for multi-selection techniques in XR.
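The two-step structure described in the abstract (mode-switching, then subselection) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `MultiSelector`, its methods, and the behavior outside multi-selection mode (a confirmation replaces the current selection) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MultiSelector:
    """Toy model of the two-step flow: mode-switching, then subselection.

    Hypothetical illustration only; names and behavior are assumptions.
    """
    multi_mode: bool = False
    selected: set = field(default_factory=set)

    def toggle_mode(self) -> None:
        # Step 1 (mode-switching): would be triggered by an input event
        # such as a pinch gesture or a voice command.
        self.multi_mode = not self.multi_mode

    def confirm(self, target: str) -> None:
        # Step 2 (subselection): would be triggered by a confirmation
        # (dwell, pinch, or voice) on the target currently gazed at.
        if self.multi_mode:
            # In multi-selection mode, each confirmation adds to the set.
            self.selected.add(target)
        else:
            # Assumption: outside the mode, a confirmation replaces
            # whatever was selected before.
            self.selected = {target}


selector = MultiSelector()
selector.confirm("cube")      # single selection
selector.toggle_mode()        # enter multi-selection mode
selector.confirm("sphere")    # accumulate targets
selector.confirm("cone")
print(sorted(selector.selected))
```

In this sketch the mode persists until it is toggled again, so the cost of mode-switching is paid once per group of targets rather than once per target.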

Paper Structure (34 sections, 8 figures)

Introduction
Related Work
Mode-Switching in XR
(Sub)Selection Techniques for Gaze-Based Systems
Design and Implementation of Multi-Selection Techniques
Rationale for Chosen Techniques
Multi-Selection Techniques
Serial Multi-Selection
Bimanual Multi-Selection
Mode-Switching Techniques
Subselection Techniques
User Study
Experimental Task
Experimental Design
Apparatus
...and 19 more sections

Figures (8)

Figure 1: The experimental task with the mode (top-left) and subselection (top-right) indicators.
Figure 2: Task completion time (TCT) results for the interaction between (a) mode-switching and subselection, (b) mode-switching and number of targets, (c) subselection and mode-switching, and (d) subselection and number of targets.
Figure 3: (a) Mode-switching time (MST) and (b) mode error (ME) results for the four mode-switching techniques.
Figure 4: Accidental subselection ratio (ASR) results for the interaction between (a) mode-switching and subselection, (b) mode-switching and number of targets, and (c) subselection and mode-switching.
Figure 5: Error rate (ER) results for the interaction between (a) mode-switching and subselection, (b) mode-switching and number of targets, (c) subselection and mode-switching, and (d) number of targets and mode-switching.
...and 3 more figures
