Align-to-Scale: Mode Switching Technique for Unimanual 3D Object Manipulation with Gaze-Hand-Object Alignment in Extended Reality

Min-yung Kim; Jinwook Kim; Ken Pfeuffer; Sang Ho Yoon

Abstract

As extended reality (XR) technologies rapidly become as ubiquitous as today's mobile devices, supporting one-handed interaction becomes essential for XR. However, the prevalent Gaze + Pinch interaction model only partially supports unimanual interaction: users can select, move, and rotate objects with one hand, but scaling typically requires both hands. In this work, we leverage the spatial alignment between gaze and hand as a mode switch to enable single-handed pinch-to-scale. We design and evaluate several techniques for one-handed scaling and assess their usability in a compound translate-scale task. Our findings show that all proposed methods effectively enable one-handed scaling, but each offers distinct advantages and trade-offs. Based on these findings, we derive design guidelines to support future 3D interfaces with unimanual interaction. Our work helps make eye-hand 3D interaction in XR more mobile, flexible, and accessible.

Paper Structure (40 sections, 5 figures, 2 tables)

Introduction
Related Work
Mode-switching Techniques for XR hand interactions
Unimanual Object Manipulation in XR
Design Space for Unimanual Object Manipulation
Gesture Type
Alignment Strategy
Control Parameter
User Study
Task Design
Implementation
Procedure
Evaluation Metrics
Result
Mode-switching Performance
...and 25 more sections

Figures (5)

Figure 1: Design space of unimanual scaling interactions. We derived four interactions (PTZ-Area, PTZ-Angle, PTZ-Span, and Push-Pull-Depth) by combining three design factors: scaling gesture type, alignment strategy, and control parameter. We chose two unimanual scaling gestures: Pinch-to-Zoom (PTZ) (A-1) and Push-Pull (A-2). We tested three gaze-hand alignment strategies for mode switching: stereoscopic view area overlap (purple rectangular area, B-1), angular dispersion between gaze–object and gaze–hand (cyan angle, B-2), and pinch gestures performed while the gaze, hand, and object are aligned (B-3). (C) The inputs depicted in purple, blue, green, and orange represent the control parameters used to compute scaling factors.
Figure 2: User study overview and temporal order of a single trial. Participants manipulated a white sphere to match the position and scale of a semi-transparent blue target sphere. Participants first translated the white sphere toward the blue sphere; the outline of the white sphere turned orange when translation mode was active (A). Then, participants entered scaling mode (mode-in), at which point the outline of the white sphere turned yellow (B-left). When the white sphere’s scale reached the target scale, it turned opaque blue, indicating that participants should switch out of scaling mode (B-right). The trial ended when the participant exited scaling mode (mode-out).
Figure 3: Results on mode-switching performance, indicating robust distinction between modes. (A) Overall mode-switching error rates by Technique. The overall mode-switching error includes mode-in errors for both translation mode and scaling mode. Error rates closer to 100% indicate more frequent incorrect mode activations. (B) Proportions of mode-switching error rates by Technique. Multiple error types can occur in a single trial. (C) Mean mode-switching time for mode-in to translation mode. (D) Mean mode-switching time for mode-in to scaling mode. Significance levels are indicated as *p < .05, **p < .01, ***p < .001; error bars represent standard error.
Figure 4: Results on scaling performance, showing accuracy in reaching the target scale. (A) Scaling error rate by Technique. (B) Scaling error rate by Technique $\times$ Target Scale. (C) The mean scale difference, showing the deviation between the final object scale and the target scale after mode-out. (D) The mean scale difference by Technique $\times$ Target Scale. Significance levels are indicated as *p < .05, **p < .01, ***p < .001; error bars represent standard error.
Figure 5: (A) Mean mode-out time for scaling, i.e., the duration from scaling mode back to the idle state, by Technique. (B) User preference rankings. (C) Raw NASA-TLX scores, measured on a 7-point scale, by Technique. (D) Subjective ratings of the Techniques. Significance levels are indicated as *p < .05, **p < .01, ***p < .001; error bars represent standard error.
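
To make the angular alignment strategy from Figure 1 (B-2) concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the angle between the gaze–object ray and the gaze–hand ray is measured at the eye position, and that a pinch performed while this angle is below a threshold enters scaling mode, with any other pinch treated as translation. The function names and the 10-degree threshold are hypothetical and chosen only for illustration.

```python
import math


def angle_between_deg(origin, a, b):
    """Angle in degrees, measured at `origin`, between the rays toward `a` and `b`."""
    va = [p - o for p, o in zip(a, origin)]
    vb = [p - o for p, o in zip(b, origin)]
    norm_a = math.sqrt(sum(x * x for x in va))
    norm_b = math.sqrt(sum(x * x for x in vb))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("point coincides with the ray origin")
    cos_theta = sum(x * y for x, y in zip(va, vb)) / (norm_a * norm_b)
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


def select_mode_on_pinch(eye, hand, obj, threshold_deg=10.0):
    """Hypothetical mode switch: an aligned pinch scales, any other pinch translates.

    `threshold_deg` is an illustrative value, not one reported in the paper.
    """
    dispersion = angle_between_deg(eye, hand, obj)
    return "scale" if dispersion <= threshold_deg else "translate"


if __name__ == "__main__":
    eye = (0.0, 0.0, 0.0)
    obj = (0.0, 0.0, 1.0)
    # Hand raised onto the line of sight toward the object.
    print(select_mode_on_pinch(eye, (0.0, 0.0, 0.4), obj))
    # Hand resting low and to the side of the line of sight.
    print(select_mode_on_pinch(eye, (0.3, -0.3, 0.4), obj))
```

A single threshold on the angle keeps the switch independent of how far the hand or the object is from the eye, which is one reason an angular criterion may be attractive. A practical system would likely add hysteresis or a dwell time to avoid flicker near the boundary.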
