What Drives You to Interact?: The Role of User Motivation for a Robot in the Wild

Amy Koike, Yuki Okafuji, Kenya Hoshimure, Jun Baba

TL;DR

This paper examines how user motivation shapes human-robot interaction in real-world settings by deploying a fully autonomous conversational robot in a shopping mall and analyzing 232 interaction groups with thick description and thematic coding. It identifies four motivation types (Function, Experiment, Curiosity, Education) and five interaction fluency patterns (Smooth, Awkward, Active, Messy, Quiet), and links motivations to interaction dynamics. The findings suggest that tailoring robot behavior to user motivation can enhance engagement and satisfaction, offering concrete design implications for LLM-powered service robots operating in the wild. Methodologically, the work advances in-the-wild HRI research by combining field data, qualitative coding, and reliability assessment to derive actionable insights for real-world robot deployment.

Abstract

In this paper, we aim to understand how user motivation shapes human-robot interaction (HRI) in the wild. To explore this, we conducted a field study by deploying a fully autonomous conversational robot in a shopping mall over two days. Through sequential video analysis, we identified five patterns of interaction fluency (Smooth, Awkward, Active, Messy, and Quiet), four types of user motivation for interacting with the robot (Function, Experiment, Curiosity, and Education), and user positioning towards the robot. We further analyzed how these motivations and positioning influence interaction fluency. Our findings suggest that incorporating users' motivation types into the design of robot behavior can enhance interaction fluency, engagement, and user satisfaction in real-world HRI scenarios.

Paper Structure (25 sections, 4 figures, 2 tables)

Figures (4)

  • Figure 1: In this study, we deployed an autonomous conversational robot in a shopping mall. Our analysis revealed four types of user motivation for interacting with the robot: Function, Experiment, Curiosity, and Education.
  • Figure 2: (a) Sota Robot: The Sota robot used in our study, a 27 cm-tall tabletop social humanoid, is capable of performing bodily gestures. (b) System Setup: The system consisted of the robot and a display, which complemented the robot’s verbal interactions by showing usage instructions, internal states, and route guidance. (c) Study Setting: The system was deployed on the ground floor of a shopping mall, positioned next to a mall map. A video camera was installed behind the robot, and an experimental notice was placed beside the system.
  • Figure 3: System Architecture: Our autonomous conversational system consists of four components, from left to right: Recognition, Dialogue Management, Action Management, and Modality Control. The system can recognize human position and speech. Dialogue is managed using a state transition model, which defines three states based on visitors' proximity. Two types of actions are used for action management: template utterances and a generative model. Template utterances are triggered during a state transition, while the generative model is activated when user speech is recognized in S3, where users are nearby. Lastly, modalities such as speech synthesis, robot gestures, and display content are aligned with the action and the robot's current state. The only error the robot can detect is a Text-to-Speech failure, either from network or API issues. In response, Sota prompted users to retry with phrases such as "Hmm, sorry. Could you try again?" and displayed instructions on its screen.
  • Figure 4: A diagrammatic model of our findings: We found that user motivation, specifically why users interacted with the robot, is a key factor in shaping human-robot interaction (HRI) in real-world settings. We identified four types of motivation (Function, Experiment, Curiosity, and Education) and examined how each influences the interaction flow. By analyzing 232 interactions, we uncovered five distinct patterns of interaction flow: Smooth, Active, Awkward, Messy, and Quiet. The model illustrates how each motivation is connected to these patterns, except for Messy. Messy interactions often arose when multiple motivations were likely present within the same group.
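
The dialogue management described in the Figure 3 caption (three proximity-based states, template utterances on state transitions, and a generative model that responds to recognized speech only in S3) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the state names S1 and S2, the distance thresholds, the template texts, and the `generate` callback are all assumptions introduced here; only the existence of three states, the label S3 for nearby users, and the two action types come from the caption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class State(Enum):
    S1 = 1  # assumed meaning: no visitor in range
    S2 = 2  # assumed meaning: visitor at a distance
    S3 = 3  # visitor nearby (the only state named in the caption)


# Hypothetical distance thresholds in metres; the source gives no values.
NEAR_M = 1.0
FAR_M = 3.0

# Hypothetical template utterances, keyed by (old_state, new_state).
TEMPLATES = {
    (State.S1, State.S2): "Hello! I can help you find your way.",
    (State.S2, State.S3): "Please ask me anything about the mall.",
    (State.S3, State.S2): "Thank you, have a nice day!",
}


def classify(distance_m: Optional[float]) -> State:
    """Map a visitor distance (None = nobody detected) to a state."""
    if distance_m is None or distance_m > FAR_M:
        return State.S1
    if distance_m > NEAR_M:
        return State.S2
    return State.S3


@dataclass
class DialogueManager:
    generate: Callable[[str], str]  # stand-in for the generative model
    state: State = State.S1

    def step(self, distance_m: Optional[float],
             speech: Optional[str] = None) -> Optional[str]:
        """Return the robot's next utterance, or None to stay silent."""
        new_state = classify(distance_m)
        if new_state != self.state:
            # A state transition triggers a template utterance, if one exists.
            old, self.state = self.state, new_state
            return TEMPLATES.get((old, new_state))
        if self.state is State.S3 and speech:
            # The generative model runs only for recognized speech in S3.
            return self.generate(speech)
        return None
```

In a sketch like this, speech recognized while the visitor is still in S2 is ignored, which mirrors the caption's statement that the generative model is activated only when users are nearby.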