Demonstrating HumanTHOR: A Simulation Platform and Benchmark for Human-Robot Collaboration in a Shared Workspace

Chenxu Wang, Boyuan Du, Jiaxin Xu, Peiyan Li, Di Guo, Huaping Liu

TL;DR

HumanTHOR is an AI2THOR-based embodied simulator that enables real-time human-in-the-loop (HITL) human-robot collaboration (HRC) in a shared workspace through VR interfaces. It provides a multimodal communication framework (image-text messages), a modular API suite, and a benchmark stack for everyday tasks (object-goal navigation and mobile manipulation) generated via scene priors. A preliminary user study with rule-based Frontier and Oracle baselines demonstrates that robot assistance improves performance and that the platform can differentiate robot capabilities and human trust. The system is designed to be scalable and extensible, supporting multi-robot setups, more complex tasks, and learning-based robot algorithms, thereby offering a practical testbed for HRC research and trust modeling.

Abstract

Human-robot collaboration (HRC) in a shared workspace has become a common pattern in real-world robot applications and has garnered significant research interest. However, most existing studies of human-in-the-loop (HITL) collaboration with robots in a shared workspace are evaluated either in simplified game environments, which offer limited realism, or on physical platforms, which offer limited scalability. To support future studies, we build an embodied framework named HumanTHOR, which enables humans to act in the simulation environment through VR devices to support HITL collaboration in a shared workspace. To validate our system, we build a benchmark of everyday tasks and conduct a preliminary user study with two baseline algorithms. The results show that the robot can effectively assist humans in collaboration, demonstrating the significance of HRC. The comparison among baselines of different capability levels affirms that our system can adequately evaluate robot capabilities and serve as a benchmark for different robot algorithms. The experimental results also indicate that there is still much room for improvement in this area and that our system can provide a preliminary foundation for future HRC research in a shared workspace. More information about the simulation environment, experiment videos, benchmark descriptions, and additional supplementary materials can be found on the website: https://sites.google.com/view/humanthor/.

Paper Structure (23 sections, 1 equation, 14 figures, 2 tables)

Figures (14)

  • Figure 1: An overview of the HumanTHOR system, where the human can act in the simulator through the VR device with a first-person view akin to the robot's. The system also supports a top-down view that displays the positions and orientations of the human and the robot in real time.
  • Figure 2: The architecture of our HumanTHOR system.
  • Figure 3: Interacting with the environment through VR devices. (a) Moving the human avatar by operating the joystick on the VR controller. (b) Using the sensors on the head-mounted display, humans can rotate their view by physically turning around. (c) When close enough to an object, humans can pick up a movable object by pressing the side button as shown in the figure, or open a receptacle such as a fridge. (d) After receiving a message from the robot, humans can respond quickly with the A/B buttons on the controller. In our benchmark of collaborative tasks, the message is presented in a dialogue box in the human view, where button A confirms and button B declines. After confirmation, a map showing the relative positions of the robot and human is displayed, which can be hidden or redisplayed with a button press.
  • Figure 4: The hierarchical benchmarks supported by the HumanTHOR platform.
  • Figure 5: The general processes of navigation tasks and manipulation tasks. The mobile manipulation task is more complex and difficult since it requires further manipulation after successfully finding the target.
  • ...and 9 more figures