More than Chit-Chat: Developing Robots for Small-Talk Interactions

Rebecca Ramnauth, Dražen Brščić, Brian Scassellati

TL;DR

The paper tackles enabling natural small talk in social robots by diagnosing limitations of current LLM-driven small talk and introducing an observer-based feedback-redirection system that monitors and steers model outputs toward established small-talk conventions. Across chatbot and robot experiments, the approach improves human-likeness, naturalness, and coherence relative to base LLM setups, demonstrating effectiveness in both text and embodied interactions. The work offers a generalizable framework for enforcing system-prompt adherence in open-domain dialogue and highlights practical implications for reducing dead-ends and enhancing user rapport in HRI.

Abstract

Beyond mere formality, small talk plays a pivotal role in social dynamics, serving as a verbal handshake for building rapport and understanding. For conversational AI and social robots, the ability to engage in small talk enhances their perceived sociability, leading to more comfortable and natural user interactions. In this study, we evaluate the capacity of current Large Language Models (LLMs) to drive the small talk of a social robot and identify key areas for improvement. We introduce a novel method that autonomously generates feedback and ensures LLM-generated responses align with small talk conventions. Through several evaluations -- involving chatbot interactions and human-robot interactions -- we demonstrate the system's effectiveness in guiding LLM-generated responses toward realistic, human-like, and natural small-talk exchanges.

Paper Structure

This paper contains 15 sections, 2 equations, and 5 figures.

Figures (5)

  • Figure 1: Robots that engage in naturalistic, small-talk conversations with users can foster rapport, enhance user comfort, and create more seamless human-robot interactions.
  • Figure 2: Human-Likeness of LLMs. This graph illustrates the extent of human likeness displayed by three LLMs, scored from 0 (no difference between human and model responses) to 4 (highest absolute difference). Each score reflects the similarity of the model's small talk to that of the participants.
  • Figure 3: System Components. This diagram outlines the architecture and processes that generate robot behaviors for autonomous small-talk interactions.
  • Figure 4: Human-Likeness of Observer v. Base Responses. The similarity of the models' small talk to that of the participants during text-based, chatbot interactions. Scores range from 0 (no difference) to 4 (highest absolute difference).
  • Figure 5: Observer v. Base in Online Assessments. Participant ratings of the human-likeness, naturalness, responsiveness, and casualness of robot behaviors show that our system consistently outperformed the base model across all dimensions.