Singing the Body Electric: The Impact of Robot Embodiment on User Expectations

Nathaniel Dennler, Stefanos Nikolaidis, Maja Matarić

TL;DR

This work investigates how robot embodiment shapes users' pre-interaction mental models of a robot's social and functional capabilities. It introduces a regression-based framework that uses MUFaSAA data to map three feature families (hand-crafted features, metaphor embeddings, and image-based ViT features) onto six construct ratings from RoSAS and EmCorp via SVM regression. Key findings show that pre-trained image- and language-based features can match hand-crafted features, and that combining modalities often improves predictive accuracy, though gains over the best single modality are not always significant. The results offer practical guidance for designing robot embodiments and interaction strategies that align user expectations with actual capabilities.

Abstract

Users develop mental models of robots to conceptualize what kinds of interactions they can have with those robots. These conceptualizations are often formed before any interaction with the robot and are based only on observing the robot's physical design. As a result, understanding conceptualizations formed from physical design is necessary to understand how users intend to interact with the robot. We propose to use multimodal features of robot embodiments to predict what kinds of expectations users will have about a given robot's social and physical capabilities. We show that using such features provides information about general mental models of the robots that generalize across socially interactive robots. We describe how these models can be incorporated into interaction design and physical design for researchers working with socially interactive robots.

Paper Structure (17 sections, 1 figure, 1 table)