Unified Understanding of Environment, Task, and Human for Human-Robot Interaction in Real-World Environments

Yuga Yano, Akinobu Mizutani, Yukiya Fukuda, Daiju Kanaoka, Tomohiro Ono, Hakaru Tamukoh

TL;DR

This work tackles real-world HRI in dynamic environments by introducing a four-layer indoor dynamic map, a task-representation framework for complex multi-step tasks, and a parallel response-generation system powered by large language models. The approach enables a waiter robot to recognize and map static, semi-static, semi-dynamic, and dynamic elements, understand multi-action tasks from natural language, and communicate planned actions in real time. Experiments in a simulated restaurant with around 100 participants show over 90% task accuracy and strong social acceptance, demonstrating practical viability and informing future improvements for home environments. Together, structured environmental representation, task sequencing via predefined task representations, and parallel natural-language interaction improve the real-world HRI capability of service robots.

Abstract

To facilitate human-robot interaction (HRI) tasks in real-world scenarios, service robots must adapt to dynamic environments and understand the required tasks while effectively communicating with humans. To accomplish HRI in practice, we propose a novel indoor dynamic map, task understanding system, and response generation system. The indoor dynamic map optimizes robot behavior by managing an occupancy grid map and dynamic information, such as furniture and humans, in separate layers. The task understanding system targets tasks that require multiple actions, such as serving ordered items. Task representations that predefine the flow of necessary actions are applied to achieve highly accurate understanding. The response generation system is executed in parallel with task understanding to facilitate smooth HRI by informing humans of the subsequent actions of the robot. In this study, we focused on waiter duties in a restaurant setting as a representative application of HRI in a dynamic environment. We developed an HRI system that could perform tasks such as serving food and cleaning up while communicating with customers. In experiments conducted in a simulated restaurant environment, the proposed HRI system successfully communicated with customers and served ordered food with 90% accuracy. In a questionnaire administered after the experiment, the HRI system of the robot received 4.2 points out of 5. These outcomes indicated the effectiveness of the proposed method and HRI system in executing waiter tasks in real-world environments.
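
As a rough illustration of keeping environment information in separate layers, the following Python sketch stores occupied grid cells per layer so that fast-changing information can be refreshed without touching the static map. It is not the authors' implementation: the class name `IndoorDynamicMap`, its methods, the cell-set representation, and the choice of which objects go into which layer are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Layer(Enum):
    """Four layer names taken from the summary; their contents are assumed."""
    STATIC = "static"
    SEMI_STATIC = "semi_static"
    SEMI_DYNAMIC = "semi_dynamic"
    DYNAMIC = "dynamic"


@dataclass
class IndoorDynamicMap:
    """Hypothetical layered grid map: one set of occupied cells per layer."""
    width: int
    height: int
    layers: dict = field(
        default_factory=lambda: {layer: set() for layer in Layer}
    )

    def mark(self, layer, cell):
        """Mark an (x, y) grid cell as occupied in the given layer."""
        x, y = cell
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise ValueError(f"cell {cell} is outside the map")
        self.layers[layer].add(cell)

    def clear(self, layer):
        """Drop all cells of one layer, leaving the other layers untouched."""
        self.layers[layer].clear()

    def is_occupied(self, cell):
        """A cell is occupied if any layer marks it."""
        return any(cell in cells for cells in self.layers.values())


# Usage: a wall in the static layer, a person in the dynamic layer (assumed).
restaurant = IndoorDynamicMap(width=10, height=10)
restaurant.mark(Layer.STATIC, (0, 0))
restaurant.mark(Layer.DYNAMIC, (4, 5))
restaurant.clear(Layer.DYNAMIC)  # refresh only the fast-changing layer
```

Separating the layers this way means an update to detected humans never requires rebuilding the occupancy grid, which is one plausible reading of why the map is layered.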

Paper Structure

This paper contains 21 sections and 15 figures.

Figures (15)

  • Figure 1: Complexity of real-world environment and our research target
  • Figure 2: Overview of dynamic map in autonomous driving
  • Figure 3: Human Support Robot (HSR) and its main sensors and actuators
  • Figure 4: Proposed indoor dynamic map for embracing HRI
  • Figure 5: Static and semi-static information: Room information is incorporated through manual mapping to the occupancy grid map
  • ...and 10 more figures