Real-Time Interactions Between Human Controllers and Remote Devices in Metaverse

Kan Chen, Zhen Meng, Xiangmin Xu, Changyang She, Philip G. Zhao

TL;DR

The paper tackles real-time interactions between human controllers and remote devices in the Metaverse by introducing a dual-faceted prediction framework that handles rendering and real-world control separately. A two-step human-in-the-loop deep reinforcement learning (DRL) approach dynamically adjusts the prediction horizons to balance rendering quality and control responsiveness, aided by expert policy guidance. A prototype demonstrates substantial reductions in Motion-To-Photon latency and in RMSE across the operator, virtual model, and remote device, with metrics improving from a PPO baseline to significantly better performance after human-in-the-loop refinement. The results highlight the practical viability of decoupled virtual modeling, predictive control, and DRL-driven horizon adaptation for immersive, low-latency Metaverse teleoperation and digital-twin applications.

Abstract

Supporting real-time interactions between human controllers and remote devices remains a challenging goal in the Metaverse due to the stringent requirements on computing workload, communication throughput, and round-trip latency. In this paper, we establish a novel framework for real-time interactions through the virtual models in the Metaverse. Specifically, we jointly predict the motion of the human controller for 1) proactive rendering in the Metaverse and 2) generating control commands to the real-world remote device in advance. The virtual model is decoupled into two components for rendering and control, respectively. To dynamically adjust the prediction horizons for rendering and control, we develop a two-step human-in-the-loop continuous reinforcement learning approach and use an expert policy to improve the training efficiency. An experimental prototype is built to verify our algorithm with different communication latencies. Compared with the baseline policy without prediction, our proposed method can reduce 1) the Motion-To-Photon (MTP) latency between human motion and rendering feedback and 2) the root mean squared error (RMSE) between human motion and real-world remote devices significantly.
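
As a rough illustration of the tracking metric named in the abstract, the sketch below computes the RMSE between a human motion trajectory and a remote device trajectory that lags behind it. The sample values, the `delayed` helper, and the fixed-lag delay model are hypothetical and are not taken from the paper; they only show why a device that follows the operator without prediction accumulates tracking error.

```python
import math


def rmse(reference, tracked):
    """Root mean squared error between two equal-length trajectories."""
    if len(reference) != len(tracked):
        raise ValueError("trajectories must have the same number of samples")
    if not reference:
        raise ValueError("trajectories must be non-empty")
    squared = [(r - t) ** 2 for r, t in zip(reference, tracked)]
    return math.sqrt(sum(squared) / len(squared))


def delayed(trajectory, lag):
    """Hypothetical delay model: hold the first sample for `lag` steps,
    then replay the trajectory, keeping the original length."""
    return [trajectory[0]] * lag + trajectory[: len(trajectory) - lag]


# Hypothetical 1-D joint-angle samples in radians (not data from the paper).
human = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]

# A remote device that reproduces the human motion two samples late.
robot_no_prediction = delayed(human, 2)

print(rmse(human, human))                # perfect tracking
print(rmse(human, robot_no_prediction))  # error introduced by the lag
```

Under this toy model, a longer lag gives a larger RMSE for a moving operator, which is the error that predicting the operator's motion ahead of time aims to reduce.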

Paper Structure (16 sections, 17 equations, 5 figures, 1 table)

This paper contains 16 sections, 17 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Proposed real-time interaction framework for humans, a real robotic arm, and its coupled virtual robotic arm in the Metaverse, where sensing, communication, prediction, control, and rendering are considered.
  • Figure 2: The workflow of the proposed framework, where the modeling accuracy and the latency need to be satisfied.
  • Figure 3: Illustration of our prototype system.
  • Figure 4: Average in each training episode.
  • Figure 5: Average in each training episode.