Advantages of Multimodal versus Verbal-Only Robot-to-Human Communication with an Anthropomorphic Robotic Mock Driver

Tim Schreiter; Lucas Morillo-Mendez; Ravi T. Chadalavada; Andrey Rudenko; Erik Billing; Martin Magnusson; Kai O. Arras; Achim J. Lilienthal

TL;DR

This study addresses the challenge of conveying robot intent in human–robot collaboration by introducing the Anthropomorphic Robotic Mock Driver (ARMoD) as an anthropomorphic communication channel. It compares verbal-only and multimodal ARMoD interaction styles across two experiments with different mobile robots, using eye-tracking and standardized questionnaires to assess attention, perception, and task performance. The results show that multimodal cues, including gaze and pointing gestures, shift participants' gaze toward the ARMoD, increase ARMoD-centered fixations, and significantly shorten reaction times to communicated instructions, although subjective ratings largely show no significant differences between the two styles. These findings suggest that ARMoD-based multimodal communication can enhance engagement and efficiency in industrial HRI, with practical implications for improving the reliability and speed of human–robot collaboration in workplace settings.

Abstract

Robots are increasingly used in shared environments with humans, making effective communication a necessity for successful human-robot interaction. In our work, we study a crucial component: active communication of robot intent. Here, we present an anthropomorphic solution where a humanoid robot communicates the intent of its host robot acting as an "Anthropomorphic Robotic Mock Driver" (ARMoD). We evaluate this approach in two experiments in which participants work alongside a mobile robot on various tasks, while the ARMoD communicates a need for human attention, when required, or gives instructions to collaborate on a joint task. The experiments feature two interaction styles of the ARMoD: a verbal-only mode using only speech and a multimodal mode, additionally including robotic gaze and pointing gestures to support communication and register intent in space. Our results show that the multimodal interaction style, including head movements and eye gaze as well as pointing gestures, leads to more natural fixation behavior. Participants naturally identified and fixated longer on the areas relevant for intent communication, and reacted faster to instructions in collaborative tasks. Our research further indicates that the ARMoD intent communication improves engagement and social interaction with mobile robots in workplace settings.

Paper Structure (15 sections, 8 figures)

Introduction
Related Work
Experimental Methodology and Design
Experiment A: Request of human assistance
Experiment B: Mediating joint navigation
Eye Tracker recordings
Participants
Results
Questionnaires
Gaze Behavior of Participants during the interactions
Discussion
Subjective user ratings
Gaze Behavior of Participants during Interactions
Conclusion and Future Work
Acknowledgement

Figures (8)

Figure 1: Participant encountering a mobile robot with a NAO robot mounted on top as the "Anthropomorphic Robotic Mock Driver" (ARMoD). The mobile robot communicates with participants through the ARMoD.
Figure 2: In Experiment A, participants interact with a robotic forklift. The ARMoD instructs the participants to place an object on the forks of the mobile robot.
Figure 3: Experimental setup for Experiment A, in which a human participant interacts with an Anthropomorphic Robotic Mock Driver (ARMoD) seated on a mobile robotic forklift. The participant begins at one end of a corridor, the forklift and ARMoD at the opposite end. The participant's task is to transport a tin can and later to collaborate with the robot by placing a box on the forklift according to the ARMoD's instructions.
Figure 4: Flow chart illustrating the programmed behavior of the ARMoD during Experiment A in a hallway encounter. The sequence of events during each step of the interaction is shown from top to bottom. Dialogue spoken by the ARMoD is indicated by italicized text in quotes, while bold text indicates movements that are only present in the multimodal interaction style condition.
Figure 5: Experimental setup for Experiment B, which investigates the interaction between multiple participants and robots in a shared workplace setting. Participants navigate between designated goal points by drawing cards, as described in [thor2020, schreiter2022magni]. Two special cards instruct participants using the phrase "Go to the robot" to look for the robot, approach and interact with it. The study aims to examine participants' behavior and perceptions during these interactions in a dynamic, realistic environment.
...and 3 more figures
