
From Interaction to Demonstration Quality in Virtual Reality: Effects of Interaction Modality and Visual Representation on Everyday Tasks

Robin Beierling, Manuel Scheibl, Jonas Dech, Abhijit Vyas, Anna-Lisa Vollmer

TL;DR

This study systematically compares three VR input configurations: motion capture gloves with hand visualization (M), controllers with hand visualization (H), and controllers with controller visualization (C). It assesses their impact on usability, workload, and task execution across activities of daily living (ADLs) in a VR kitchen. It combines subjective measures (SUS, NASA-TLX) with trajectory-based objective metrics, including semantic trajectories and similarity measures such as DTW, DFD, and Levenshtein distance, to reveal task-dependent trade-offs between efficiency and naturalism. The authors introduce a semantic trajectory framework to segment tasks into subtasks, enabling granular analysis of interaction quality and error patterns. Findings indicate no universal winner: controllers excel in goal-oriented tasks for speed and consistency, while motion capture gloves promote natural, semantically faithful behavior in manner-oriented tasks, with gestures offering intuitive but less precise control. These insights inform VR interaction design and have practical implications for VR-based training, rehabilitation, and demonstration data quality.

Abstract

Virtual Reality (VR) is increasingly used for training and demonstration purposes, with applications ranging from robot learning to rehabilitation. However, the choice of input device and its visualization might influence workload and thus user performance, leading to suboptimal demonstrations or reduced training effects. This study investigates how different VR input configurations (motion capture gloves, controllers with hand visualization, and controllers with controller visualization) affect user experience and task execution, with the goal of identifying which configuration is best suited for which type of task. Participants performed various kitchen-related activities of daily living (ADLs), including object placement, cutting, cleaning, and pouring, in a simulated environment. To address two research questions, we evaluated user experience using the System Usability Scale and NASA Task Load Index (RQ1), and task-specific interaction behavior (RQ2). The latter was assessed using trajectory segmentation, analyzing movement efficiency, unnecessary actions, and execution precision. While no significant differences in overall usability and workload were found, trajectory analysis revealed configuration-specific execution behaviors with different movement strategies. Controllers enabled significantly faster task completion with less movement variability in pick-and-place style tasks such as table setting. In contrast, motion capture gloves produced more natural movements with fewer unnecessary actions, but also showed greater variance in movement patterns for manner-oriented tasks such as cutting bread. These findings highlight trade-offs between efficiency and naturalism, and have implications for optimizing VR-based training, improving the quality of user-generated demonstrations, and tailoring interaction design to specific application goals.
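
To make the similarity measures named in the TL;DR more concrete, the following is a minimal sketch of how they might be computed. It is not the authors' implementation. It assumes that DFD stands for the discrete Fréchet distance, that spatial trajectories are lists of coordinate tuples compared with a Euclidean local cost, and that a semantic trajectory can be represented as a sequence of subtask labels. The function names and the example labels ("reach", "grasp", and so on) are hypothetical.

```python
from math import dist, inf


def levenshtein(a, b):
    """Edit distance between two label sequences (e.g. subtask labels)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def dtw(p, q):
    """Dynamic time warping cost between two point sequences.

    Assumption: Euclidean distance as the local cost, no window constraint.
    """
    n, m = len(p), len(q)
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(p[i - 1], q[j - 1])
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def discrete_frechet(p, q):
    """Discrete Frechet distance between two point sequences."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[-1][-1]


# Hypothetical semantic trajectories: a reference and a demonstration
# that contains an unnecessary release-and-regrasp.
reference = ["reach", "grasp", "move", "place"]
demo = ["reach", "grasp", "release", "grasp", "move", "place"]
edit_distance = levenshtein(reference, demo)  # two extra labels were inserted

# Hypothetical 2D hand paths: the second is the first shifted by one unit.
path_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
path_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
frechet = discrete_frechet(path_a, path_b)
warp_cost = dtw(path_a, [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (2.0, 0.0)])
```

In this sketch, Levenshtein distance operates on the symbolic subtask sequence, so it is sensitive to extra or missing actions, while DTW and the discrete Fréchet distance operate on the spatial path and tolerate differences in timing. DTW sums the local costs along the best alignment, whereas the discrete Fréchet distance takes the largest single deviation along it.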

Paper Structure (32 sections, 12 equations, 6 figures, 7 tables)

Figures (6)

  • Figure 1: The simulated virtual kitchen for the experiments
  • Figure 2: The ICs and their visualization. The lower row shows the conditions, which use either the Manus motion capture gloves (left column) or the Valve Index controller (middle and right columns). The upper row shows the virtual visualization, which is either the hand reconstruction (left and middle columns) or only the controller (right column). In all images, the virtual milk pitcher from the pouring scenario is being grasped and held.
  • Figure 3: Overview of two scenes used in the study: (a) Introduction and (b) Table Setup.
  • Figure 4: Two scenes used in the study: (a) Dishwasher task focusing on precision and placement; (b) Cutting task focusing on the motion of the knife.
  • Figure 5: Two scenes focused on continuous motion: (a) Cleaning, requiring vertical movement control; (b) Pouring, requiring slow and precise execution to avoid spilling.
  • ...and 1 more figure