Robot-Wearable Conversation Hand-off for Navigation
Dániel Szabó, Aku Visuri, Benjamin Tag, Simo Hosio
TL;DR
The paper addresses cognitive load in indoor navigation by introducing a conversation hand-off that transfers a conversational agent (CA) from a stationary robot to a wearable. The system uses a local server architecture with Mimic 3 for text-to-speech (TTS), Whisper for automatic speech recognition (ASR), and RASA for dialogue, and is evaluated in a within-subjects study with $N=24$ participants across Robot-Only, Wearable-Only, and Hand-off conditions on a university campus. Results show that the wearable embodiment is preferred and the hand-off is engaging, but there are no significant navigation-performance gains; the reported mean task time is $146.4$ s, and average response times are $2.89$ s (robot), $4.26$ s (watch), and $3.96$ s (hand-off). Based on these findings, the paper offers design considerations covering shared voice and state, explicit triggers, and the use of the robot as an attention anchor, in support of cognitive augmentation through multi-embodiment public AI assistants.
Abstract
Navigating large and complex indoor environments, such as universities, airports, and hospitals, can be cognitively demanding, requiring attention and effort. While mobile applications provide convenient navigation support, they occupy the user's hands and visual attention, limiting natural interaction. In this paper, we explore conversation hand-off as a method for multi-device indoor navigation, where a Conversational Agent (CA) transitions seamlessly from a stationary social robot to a wearable device. We evaluated robot-only, wearable-only, and robot-to-wearable hand-off in a university campus setting using a within-subjects design with N=24 participants. We find that conversation hand-off is experienced as engaging, even though no performance benefits were observed, and that most participants preferred the wearable-only system. Our findings suggest that the design of such re-embodied assistants should maintain a shared voice and state across embodiments. We demonstrate how conversational hand-offs can bridge cognitive and physical transitions, enriching human interaction with embodied AI.
