SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents
Ruiyi Wang, Haofei Yu, Wenxin Zhang, Zhengyang Qi, Maarten Sap, Graham Neubig, Yonatan Bisk, Hao Zhu
TL;DR
Social intelligence in language agents remains below human capabilities, motivating an interactive learning approach. SOTOPIA-$π$ combines dynamic social task generation, behavior cloning (BC) from GPT-4 expert data, and offline self-reinforcement with GPT-4-based feedback to train socially skilled agents, here built on Mistral-7B. Empirical results show the 7B base model approaching GPT-4 in social goal completion and improving in safety while preserving general QA abilities; however, GPT-4-based evaluators tend to overestimate performance compared with human judgments, signaling evaluator bias. The work demonstrates a scalable offline pathway for improving social skills in LLMs and highlights the need for robust evaluation methods and potential online extensions.
Abstract
Humans learn social skills through both imitation and social interaction. This social learning process is largely understudied by existing research on building language agents. Motivated by this gap, we propose an interactive learning method, SOTOPIA-$π$, that improves the social intelligence of language agents. This method leverages behavior cloning and self-reinforcement training on social interaction data filtered according to large language model (LLM) ratings. We show that our training method allows a 7B LLM to reach the social goal completion ability of an expert model (GPT-4-based agent), while improving the safety of language agents and maintaining general QA ability on the MMLU benchmark. We also find that this training paradigm uncovers some difficulties in LLM-based evaluation of social intelligence: LLM-based evaluators overestimate the abilities of the language agents trained specifically for social interaction.
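
The data-filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Episode` fields, the rating scale, the `min_score` threshold, and the function names are all hypothetical, chosen only to show how episodes rated by an LLM evaluator might be split into a behavior cloning set (expert episodes) and a self-reinforcement set (the agent's own episodes).

```python
from dataclasses import dataclass


@dataclass
class Episode:
    source: str        # "expert" (e.g. a GPT-4-based agent) or "self" (the agent being trained)
    dialogue: str      # the social interaction transcript
    goal_score: float  # LLM evaluator's goal-completion rating; the scale is an assumption


def filter_by_rating(episodes, min_score):
    """Keep only episodes whose evaluator rating meets the threshold."""
    return [ep for ep in episodes if ep.goal_score >= min_score]


def build_training_sets(episodes, min_score=7.0):
    """Split highly rated episodes by who produced them.

    The threshold of 7.0 is a hypothetical value, not taken from the paper.
    """
    kept = filter_by_rating(episodes, min_score)
    bc_data = [ep for ep in kept if ep.source == "expert"]  # behavior cloning
    sr_data = [ep for ep in kept if ep.source == "self"]    # self-reinforcement
    return bc_data, sr_data


episodes = [
    Episode("expert", "placeholder dialogue A", 9.0),
    Episode("expert", "placeholder dialogue B", 4.0),
    Episode("self", "placeholder dialogue C", 8.0),
    Episode("self", "placeholder dialogue D", 2.0),
]
bc_data, sr_data = build_training_sets(episodes)
```

In this sketch both training sets would then feed an ordinary supervised fine-tuning step; how the ratings are produced and how the threshold is chosen are left open here.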
