Large Language Models as Zero-Shot Human Models for Human-Robot Interaction

Bowen Zhang, Harold Soh

TL;DR

LLMs offer a promising (but incomplete) approach to human modeling for HRI. The paper shows how LLM-based human models can be integrated into a social robot's planning process and applied in trust-focused HRI scenarios.

Abstract

Human models play a crucial role in human-robot interaction (HRI), enabling robots to consider the impact of their actions on people and plan their behavior accordingly. However, crafting good human models is challenging; capturing context-dependent human behavior requires significant prior knowledge and/or large amounts of interaction data, both of which are difficult to obtain. In this work, we explore the potential of large language models (LLMs) -- which have consumed vast amounts of human-generated text data -- to act as zero-shot human models for HRI. Our experiments on three social datasets yield promising results; the LLMs are able to achieve performance comparable to purpose-built models. That said, we also discuss current limitations, such as sensitivity to prompts and spatial/numerical reasoning mishaps. Based on our findings, we demonstrate how LLM-based human models can be integrated into a social robot's planning process and applied in HRI scenarios. Specifically, we present one case study on a simulated trust-based table-clearing task and replicate past results that relied on custom models. Next, we conduct a new robot utensil-passing experiment (n = 65) where preliminary results show that planning with an LLM-based human model can achieve gains over a basic myopic plan. In summary, our results show that LLMs offer a promising (but incomplete) approach to human modeling for HRI.
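
To make the contrast between a myopic plan and planning with a human model concrete, here is a minimal sketch. It is not the authors' implementation: the function `toy_human_model` is a hand-written stub standing in for an LLM query, and the action names, reward values, and trust dynamics are all assumptions chosen only for illustration.

```python
from itertools import product
from typing import Callable, Sequence, Tuple

# Hypothetical interface: given current trust and a robot action,
# return the predicted trust afterwards. In an LLM-based setup this
# call would be answered by prompting a language model instead.
HumanModel = Callable[[float, str], float]


def toy_human_model(trust: float, action: str) -> float:
    """Stub (NOT an LLM): handling a low-risk item raises trust by 0.3."""
    if action == "low":
        return min(1.0, trust + 0.3)
    return trust  # assumption: high-risk attempts leave trust unchanged


def expected_reward(trust: float, action: str) -> float:
    """Assumed rewards: a high-risk item pays 3.0 but only succeeds if the
    human does not intervene, which we take to happen with probability
    equal to the current trust. A low-risk item always pays 1.0."""
    return 3.0 * trust if action == "high" else 1.0


def plan(model: HumanModel, trust: float, actions: Sequence[str],
         horizon: int) -> Tuple[Tuple[str, ...], float]:
    """Exhaustively score every action sequence of the given length."""
    best_seq: Tuple[str, ...] = ()
    best_val = float("-inf")
    for seq in product(actions, repeat=horizon):
        t, total = trust, 0.0
        for a in seq:
            total += expected_reward(t, a)
            t = model(t, a)  # roll the human model forward
        if total > best_val:
            best_seq, best_val = seq, total
    return best_seq, best_val


# Horizon 1 is the myopic case: it picks ("high",) at trust 0.4.
myopic_seq, _ = plan(toy_human_model, 0.4, ("low", "high"), horizon=1)
# Horizon 3 first builds trust: it picks ("low", "high", "high").
lookahead_seq, lookahead_val = plan(toy_human_model, 0.4, ("low", "high"), horizon=3)
```

In this toy setting the lookahead planner gives up some immediate reward to raise trust first, which is the qualitative behavior a trust-aware plan is meant to capture. Swapping the stub for a real LLM call would also require parsing the model's text output into a trust estimate, which this sketch does not attempt.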

Paper Structure (13 sections, 1 equation, 6 figures, 4 tables)

Figures (6)

  • Figure 1: In this work, we explore how large language models (LLMs) can be used as zero-shot human models in HRI. We first evaluate the effectiveness of such models using benchmark datasets. Then, we demonstrate how LLM-based models can be used in planning for two trust-related HRI scenarios.
  • Figure 2: Datasets and Example Prompts in Prediction Experiments. We use two HRI datasets, MANNERS-DB [tjomsland2022mind] and Trust-Transfer [soh2018transfer, soh2020multi], and include SocialIQA [sap2019socialiqa], a general social reasoning benchmark for human interactions. For each dataset, we show an illustrative image (reproduced from [tjomsland2022mind], [soh2020multi], and [sap2019socialiqa], respectively) and an example prompt.
  • Figure 3: Example altered prompt used in trust-transfer experiment in the household domain.
  • Figure 4: Example prompt used in table-clearing experiment.
  • Figure 5: (Left) Utensils used for the experiment: spatula, egg whisk, scissors, and knife. (Right) The experiment environment emulates a kitchen. The utensil tray is highlighted for better visibility.
  • ...and 1 more figure