IMMERTWIN: A Mixed Reality Framework for Enhanced Robotic Arm Teleoperation

Florent P. Audonnet; Ixchel G. Ramirez-Alpizar; Gerardo Aragon-Camarasa

TL;DR

IMMERTWIN addresses limited situational awareness and high cognitive load in robotic teleoperation by embedding operators in a mixed-reality digital twin that closes the loop with real robots. The approach combines a virtual gripper controlled via VR with TELESIM motion planning and near-real-time 3D point-cloud feedback, built on Unreal Engine 5.4 and ROS2, to enable immersive, plug-and-play teleoperation. In a 26-participant study in which users performed a tower-stacking task with two robots (UR3 and Baxter), IMMERTWIN reduced mental workload and was preferred by users, though objective manipulation metrics showed limited gains over TELESIM. The work highlights the trade-offs of MR interfaces for teleoperation and points to future directions in ergonomics, realism, and alternate tasks.

Abstract

We present IMMERTWIN, a mixed reality framework for enhanced robotic arm teleoperation that uses a closed-loop digital twin as a bridge for interaction between the user and the robotic system. We evaluated IMMERTWIN through a medium-scale user survey with 26 participants on two robots. Users were asked to teleoperate both robots from inside the virtual environment to pick and place 3 cubes into a tower, and to repeat this task as many times as possible in 10 minutes, with only 5 minutes of training beforehand. Our experimental results show that most users succeeded in building at least one tower of 3 cubes regardless of the robot used, with a maximum of 10 towers (1 tower per minute). In addition, users preferred IMMERTWIN over our previous work, TELESIM, as it imposed a lower mental workload. The project website and source code can be found at: https://cvas-ug.github.io/immertwin

Paper Structure (9 sections, 6 figures)

Introduction
Background
IMMERTWIN Framework
Experimental Setup
TELESIM
IMMERTWIN
Hardware
Evaluation
Conclusion and Future Work

Figures (6)

Figure 1: Our experimental setup comprises the following components. (1) The view from the user inside Unreal Engine, with their virtual hands (A) and the virtual gripper (B). (2) The user wearing the VR headset, with a black security tape (C) to keep users from walking towards the robots. (3) The room containing both the UR3 robot (D) and the Baxter robot (E), along with 4 ZED 2i cameras (F) that live-stream a point cloud into the virtual environment.
Figure 2: Overview of IMMERTWIN. Our framework, IMMERTWIN, shown in the green dotted line, accepts the pose of any 3D VR controller (shown in the black dotted line) to update the position of the user's virtual hand in Unreal Engine. The user can then grab the robotic gripper and move the robot where they want. The new robotic goal is then transmitted to TELESIM (shown in the blue dotted line), which performs motion planning and collision avoidance. The state of the virtual robot is then transmitted to the real robot, shown in the red dotted line, to update its position. Finally, the state of the real robot is transmitted back into Unreal Engine to create a closed-loop digital twin.
Figure 3: Percentage of participants who completed each number of towers, for both robots, with TELESIM and IMMERTWIN, respectively.
Figure 4: Ratio of different statistics collected during the experiment. The Placing Rate is calculated as the number of place actions over the number of pick actions. The Collapse Rate is calculated as the number of collapse actions over the number of pick actions. The Still in Place Rate is calculated as the number of place actions minus the number of collapses, all over the number of pick actions, effectively rating the tower's stability.
Figure 5: Result of the raw NASA-TLX mental aspect, which evaluates how mentally demanding the task was. A lower score indicates lower effort. The horizontal bar at the top indicates the significance value between the two items marked by the ticks at either end of the bar.
...and 1 more figure
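
The three ratios defined in the Figure 4 caption can be sketched in a few lines of Python. This is only an illustration of the stated definitions, not the authors' analysis code: the `TrialCounts` container, the `stacking_rates` function, and the example counts are hypothetical names and values introduced here.

```python
from dataclasses import dataclass


@dataclass
class TrialCounts:
    """Action counts for one session (hypothetical container, not from the paper)."""
    picks: int       # number of pick actions
    places: int      # number of place actions
    collapses: int   # number of tower collapses


def stacking_rates(counts: TrialCounts) -> dict[str, float]:
    """Compute the Figure 4 ratios, each normalised by the number of picks."""
    if counts.picks <= 0:
        raise ValueError("rates are undefined without at least one pick action")
    return {
        # Placing Rate: place actions over pick actions
        "placing_rate": counts.places / counts.picks,
        # Collapse Rate: collapse actions over pick actions
        "collapse_rate": counts.collapses / counts.picks,
        # Still in Place Rate: (places - collapses) over pick actions
        "still_in_place_rate": (counts.places - counts.collapses) / counts.picks,
    }


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    print(stacking_rates(TrialCounts(picks=10, places=8, collapses=2)))
```

Because all three ratios share the same denominator, the Still in Place Rate equals the Placing Rate minus the Collapse Rate, which is a quick consistency check when reading the figure.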
