Does the Appearance of Autonomous Conversational Robots Affect User Spoken Behaviors in Real-World Conference Interactions?
Zi Haur Pang, Yahui Fu, Divesh Lala, Mikey Elmers, Koji Inoue, Tatsuya Kawahara
TL;DR
This study investigates how the appearance of autonomous conversational robots influences users' spoken behavior in real-world interactions at a conference, comparing the highly human-like android ERICA with the less anthropomorphic humanoid TELECO. Using transcripts from 42 participants and a broad set of NLP-derived linguistic, dialogue, emotion, and mimicry features, the authors find moderate effect sizes: users produced more complex syntax and fewer disfluencies with ERICA than with TELECO. In a predictive modeling component, Naïve Bayes best distinguishes robot human-likeness from speech features; SHAP and permutation analyses identify syntactic complexity and disfluency metrics as the key predictors. The work frames these findings within cognitive load and Communication Accommodation Theory, suggesting that robot design should aim to elicit fluent, structured user speech to improve communicative alignment. It also notes implications for real-world HRI benchmarks and points to future work incorporating non-verbal cues and larger samples.
Abstract
We investigate the impact of robot appearance on users' spoken behavior during real-world interactions by comparing a human-like android, ERICA, with a less anthropomorphic humanoid, TELECO. Analyzing data from 42 participants at SIGDIAL 2024, we extracted linguistic features such as disfluencies and syntactic complexity from conversation transcripts. The results showed moderate effect sizes, suggesting that participants produced fewer disfluencies and employed more complex syntax when interacting with ERICA. We then trained classification models, including Naïve Bayes, which achieved an F1-score of 71.60%, and conducted a feature importance analysis; both highlighted the significant role of disfluencies and syntactic complexity in interactions with robots of varying human-like appearance. Discussing these findings within the frameworks of cognitive load and Communication Accommodation Theory, we conclude that designing robots to elicit more structured and fluent user speech can enhance their communicative alignment with humans.
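To make the notion of an effect size between the two robot conditions concrete, the following is a minimal sketch of one common measure, Cohen's d with a pooled standard deviation. The abstract does not state which effect size measure was used, so the choice of Cohen's d is an assumption. The function name `cohens_d` and the per-participant values are hypothetical and purely illustrative; they are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled sample SD.

    Assumption: the abstract does not name its effect size measure;
    Cohen's d is used here only as a common illustrative choice.
    """
    na, nb = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical disfluency rates per participant (toy values, not study data).
erica_rates = [2.0, 3.0, 4.0]
teleco_rates = [3.0, 4.0, 5.0]

d = cohens_d(erica_rates, teleco_rates)
print(f"Cohen's d (ERICA vs. TELECO, toy data): {d:.2f}")
```

A negative value here means the first group has the lower mean. By the usual convention, magnitudes around 0.5 are read as moderate, although that threshold is a rule of thumb and not something the abstract specifies.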
