Do LLM Agents Exhibit Social Behavior?

Yan Leng, Yuan Yuan

TL;DR

Utterance-based reasoning reliably predicts LLMs' final actions; references to altruism, fairness, and cooperation in the reasoning increase the likelihood of prosocial actions, while mentions of self-interest and competition reduce it.

Abstract

As LLMs increasingly take on roles in human-AI interactions and autonomous AI systems, understanding their social behavior becomes important for informed use and continuous improvement. However, their behaviors in social interactions with humans and other agents, as well as the mechanisms shaping their responses, remain underexplored. To address this gap, we introduce a novel probabilistic framework, State-Understanding-Value-Action (SUVA), to systematically analyze LLM responses in social contexts based on their textual outputs (i.e., utterances). Using canonical behavioral economics games and social preference concepts relatable to LLM users, SUVA assesses LLMs' social behavior through both their final decisions and the response generation processes leading to those decisions. Our analysis of eight LLMs -- including two GPT, four LLaMA, and two Mistral models -- suggests that most models do not generate decisions aligned solely with self-interest; instead, they often produce responses that reflect social welfare considerations and display patterns consistent with direct and indirect reciprocity. Additionally, higher-capacity models more frequently display group identity effects. The SUVA framework also provides explainable tools -- including tree-based visualizations and probabilistic dependency analysis -- to elucidate how factors in LLMs' utterance-based reasoning influence their decisions. We demonstrate that utterance-based reasoning reliably predicts LLMs' final actions; references to altruism, fairness, and cooperation in the reasoning increase the likelihood of prosocial actions, while mentions of self-interest and competition reduce them. Overall, our framework enables practitioners to assess LLMs for applications involving social interactions, and provides researchers with a structured method to interpret how LLM behavior arises from utterance-based reasoning.

Paper Structure (71 sections, 6 equations, 12 figures, 5 tables, 1 algorithm)

Figures (12)

  • Figure 1: Illustration of abstracting token-by-token generation processes of LLMs under the SUVA framework.
  • Figure 2: Dictator games employed in the study. (a) Two-party, single-round dictator game measuring distributional preferences and group identity effects (varied by the group identities assigned to A and B). Numbers indicate payoffs to A and B. Player B (the LLM) determines the payoff distribution between itself and Player A (the simulated match). For instance, if B chooses B2, the payoffs to Players A and B are $(\pi^{B2, A}, \pi^{B2, B})=(200,600)$, respectively. (b) Two-party, two-round dictator game measuring direct reciprocity. Player B (the LLM) responds to the choice of Player A (the simulated match), which signals either A's good intention or A's misbehavior. (c) Three-party, two-round dictator game measuring indirect reciprocity. Player B (the LLM) responds to the choice of Player A (the simulated match), which signals either A's good intention or A's misbehavior toward Player C (the third party).
  • Figure 3: [Main Result 1] Distributional preferences indicated by self interest, competition, difference aversion, and social welfare, reflected by regression coefficients. Error bars are 95% CIs.
  • Figure 4: [Main Result 2] Interaction effects of shared group identity and distributional preferences. Error bars are 95% CIs.
  • Figure 5: [Main Result 3] Reciprocity preferences for LLMs. The left and right panels represent direct and indirect reciprocity preferences, respectively. The plotted estimates indicate the probability of the LLM behaving prosocially when informed that the match (opponent) has previously helped or misbehaved towards them (direct) or others (indirect). Error bars are 95% CIs.
  • ...and 7 more figures
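
The two-party dictator game in Figure 2(a) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Option` class, the `describe` helper, and the feature names are hypothetical, and only the B2 payoffs (200, 600) come from the caption. The B1 payoffs are invented for illustration. The four summary quantities loosely mirror the preference dimensions named in the Figure 3 caption and are assumptions about how such dimensions might be computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One allocation Player B can choose: payoffs to Player A and to Player B."""
    name: str
    payoff_a: int
    payoff_b: int


def describe(option: Option) -> dict:
    """Summarize an allocation from Player B's point of view (illustrative features)."""
    return {
        "own_payoff": option.payoff_b,                      # self-interest
        "total_payoff": option.payoff_a + option.payoff_b,  # social welfare
        "advantage": option.payoff_b - option.payoff_a,     # competition: how far B is ahead of A
        "gap": abs(option.payoff_b - option.payoff_a),      # difference aversion: size of inequality
    }


# B2 uses the payoffs quoted in the Figure 2 caption: (A, B) = (200, 600).
b2 = Option("B2", payoff_a=200, payoff_b=600)
# B1 is a HYPOTHETICAL alternative, invented only to have something to compare against.
b1 = Option("B1", payoff_a=400, payoff_b=400)

for opt in (b1, b2):
    print(opt.name, describe(opt))
```

Under these assumed payoffs, a purely self-interested Player B would pick B2 (higher own payoff), while a difference-averse Player B would lean toward B1 (no gap). Comparing choices across such option pairs is one way the trade-offs in the caption could be made concrete.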