Advantages of Multimodal versus Verbal-Only Robot-to-Human Communication with an Anthropomorphic Robotic Mock Driver
Tim Schreiter, Lucas Morillo-Mendez, Ravi T. Chadalavada, Andrey Rudenko, Erik Billing, Martin Magnusson, Kai O. Arras, Achim J. Lilienthal
TL;DR
This study addresses the challenge of conveying robot intent in human–robot collaboration by introducing the Anthropomorphic Robotic Mock Driver (ARMoD), a humanoid robot that serves as a communication channel for its host robot. It compares verbal-only and multimodal ARMoD interaction styles across two experiments with different mobile robots, using eye-tracking and standardized questionnaires to assess attention, perception, and task performance. The results show that multimodal cues, including gaze and pointing gestures, shift participants' gaze toward the ARMoD, increase ARMoD-centered fixations, and significantly shorten reaction times to communicated instructions, though subjective ratings largely show no significant differences. These findings suggest that ARMoD-based multimodal communication can enhance engagement and efficiency in industrial HRI, with practical implications for improving the reliability and speed of human–robot collaboration in workplace settings.
Abstract
Robots are increasingly used in shared environments with humans, making effective communication a necessity for successful human-robot interaction. In our work, we study a crucial component: active communication of robot intent. Here, we present an anthropomorphic solution in which a humanoid robot, acting as an "Anthropomorphic Robotic Mock Driver" (ARMoD), communicates the intent of its host robot. We evaluate this approach in two experiments in which participants work alongside a mobile robot on various tasks, while the ARMoD communicates a need for human attention when required or gives instructions for collaborating on a joint task. The experiments feature two interaction styles of the ARMoD: a verbal-only mode using only speech, and a multimodal mode that additionally includes robotic gaze and pointing gestures to support communication and register intent in space. Our results show that the multimodal interaction style, including head movements and eye gaze as well as pointing gestures, leads to more natural fixation behavior: participants identified and fixated longer on the areas relevant for intent communication, and reacted faster to instructions in collaborative tasks. Our research further indicates that ARMoD intent communication improves engagement and social interaction with mobile robots in workplace settings.
