SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents

Ruiyi Wang; Haofei Yu; Wenxin Zhang; Zhengyang Qi; Maarten Sap; Graham Neubig; Yonatan Bisk; Hao Zhu

TL;DR

Social intelligence in language agents remains below human capabilities, motivating an interactive learning approach. SOTOPIA-$π$ combines dynamic social task generation, behavior cloning (BC) from GPT-4 expert data, and offline self-reinforcement with GPT-4-based feedback to train socially skilled agents, here built on Mistral-7B. Empirical results show the trained 7B model approaching GPT-4 in social goal completion and improving in safety while preserving general QA ability on MMLU. However, GPT-4-based evaluators tend to overestimate performance compared with human judgments, signaling evaluator bias. The work demonstrates a scalable offline pathway for improving social skills in LLMs and highlights the need for robust evaluation methods and potential online extensions.

Abstract

Humans learn social skills through both imitation and social interaction. This social learning process is largely understudied by existing research on building language agents. Motivated by this gap, we propose an interactive learning method, SOTOPIA-$π$, improving the social intelligence of language agents. This method leverages behavior cloning and self-reinforcement training on filtered social interaction data according to large language model (LLM) ratings. We show that our training method allows a 7B LLM to reach the social goal completion ability of an expert model (GPT-4-based agent), while improving the safety of language agents and maintaining general QA ability on the MMLU benchmark. We also find that this training paradigm uncovers some difficulties in LLM-based evaluation of social intelligence: LLM-based evaluators overestimate the abilities of the language agents trained specifically for social interaction.
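
The rating-based filtering step described above can be illustrated with a minimal sketch. It assumes each interaction episode carries a single numeric goal-completion rating from an LLM judge and that episodes are kept when the rating meets a fixed cutoff; the `Episode` fields, the `select_training_data` helper, and the cutoff value of 7.0 are hypothetical names and numbers chosen for illustration, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    task_id: str
    policy: str        # "expert" (behavior cloning) or "agent" (self-reinforcement)
    goal_score: float  # hypothetical LLM rating of goal completion
    dialogue: List[str]


def select_training_data(episodes: List[Episode], policy: str, min_score: float) -> List[Episode]:
    """Keep episodes produced by `policy` whose rating is at least `min_score`."""
    return [e for e in episodes if e.policy == policy and e.goal_score >= min_score]


# Toy data: two expert episodes and two agent episodes with made-up ratings.
episodes = [
    Episode("t1", "expert", 9.0, ["Hi, could we split the bill?", "Sure."]),
    Episode("t1", "expert", 3.0, ["Pay for everything.", "No."]),
    Episode("t2", "agent", 8.0, ["Can I borrow your notes?", "Of course."]),
    Episode("t2", "agent", 5.0, ["Give me your notes.", "Why?"]),
]

# Assumed cutoff; the actual selection rule may differ.
bc_data = select_training_data(episodes, "expert", 7.0)  # data for behavior cloning
sr_data = select_training_data(episodes, "agent", 7.0)   # data for self-reinforcement
```

In this sketch the same filter serves both training paradigms; only the policy that generated the episodes differs. The filtered episodes would then be used as supervised fine-tuning data.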

Paper Structure (46 sections, 14 figures, 9 tables)

Introduction
Background
SOTOPIA environment
Interactive learning
SOTOPIA-$π$ framework
Experimental setting
Agent models
Training
Evaluation
Does SOTOPIA-$π$ improve the social intelligence of language agents?
How does SOTOPIA-$π$ influence other capabilities of LLMs?
Related work
Social Intelligence in LLMs
Reinforcement Learning for LLMs
LLM Alignment and Evaluation
...and 31 more sections

Figures (14)

Figure 1: We propose SOTOPIA-$π$, which (1) automatically generates new social tasks, (2) collects data from both expert policy and agent policy for training, and (3) updates agent policy based on positive data rated by GPT-4. We implement (4) human and GPT-4 evaluation on our trained agent performing tasks in SOTOPIA with the partner agent. Our training paradigms include behavior cloning and self-reinforcement. For evaluation, we use SOTOPIA-EVAL and a fixed partner policy (GPT-3.5-based). Note that the character profiles are omitted and the examples are shortened for demonstration.
Figure 2: Left: a social task with character profiles. Right: an example turn from the perspective of the role-played character. This is the 3rd turn, after each of the two characters has spoken once.
Figure 3: Prompt template for generating social tasks.
Figure 4: GPT-4-based automatic evaluation scores and human evaluation scores of the goal completion dimension. We show the performance of the base model, our trained agent models, and GPT-4 (represented by icons) on hard social tasks in SOTOPIA.
Figure 5: An example of the explanation of the believability dimension of social annotation on the evaluation instruction page. Each annotator is asked to read similar definitions of the social intelligence dimensions and their corresponding annotation standards on the evaluation instruction page.
...and 9 more figures
