Social coordination perpetuates stereotypic expectations and behaviors across generations in deep multi-agent reinforcement learning

Rebekah A. Gelpí; Yikai Tang; Ethan C. Jackson; William A. Cunningham

TL;DR

A computational model of dynamic social coordination illustrates how pre-existing expectations about others' behavior can create a “feedback loop” that engenders and entrenches role-consistent stereotypic behavior. Human behavior on the same task generates a comparable feedback loop.

Abstract

Despite often being perceived as morally objectionable, stereotypes are a common feature of social groups, a phenomenon that has often been attributed to biased motivations or limits on the ability to process information. We argue that one reason for this continued prevalence is that pre-existing expectations about how others will behave, in the context of social coordination, can change the behaviors of one's social partners, creating the very stereotype one expected to see, even in the absence of other potential sources of stereotyping. We use a computational model of dynamic social coordination to illustrate how this "feedback loop" can emerge, engendering and entrenching stereotypic behavior, and then show that human behavior on the task generates a comparable feedback loop. Notably, people's choices on the task are not related to social dominance or system justification, suggesting biased motivations are not necessary to maintain these stereotypes.

Paper Structure (3 sections, 8 equations, 5 figures)

Collective Rewards.
Market Game.
Agent Game.

Figures (5)

Figure 1: Non-technical schematic of task design. Agents (top) with varying skill specializations observe their own resources and input this information into a neural network to determine their actions. The market decider (bottom) observes an agent's appearance when that agent chooses to sell to the market, and inputs this information into its own neural network to predict whether the agent will sell wood or stone. Successful sales (and the agents'/market decider's reward) depend on the prediction matching the agent's action.
Figure 3: Market predictions with differing population sizes. In Study 1 (A), the market decider learns to predict that agents will sell the resource that matches their skill. With larger populations, the market decider takes longer to learn, but population size does not affect the ultimate frequency of skill-based predictions. In Study 2 (B), the market decider’s predictions quickly converge to skill-based predictions for the majority group members (i.e., chopping specialists sell wood and mining specialists sell stone); however, the market decider’s predictions for minority members depend on group size, such that the market decider makes mostly skill-based predictions for the minority in the population size of 100, but almost exclusively stereotypic predictions for the population size of 600.
Figure 4: Raw learning signal and market predictions for unobserved agents. (A): Proportion of stereotypic predictions by the market decider for previously observed majority agents (blue), minority agents (red), and held-out unobserved agents (green) in Study 3. The market stably predicts that unseen agents will mostly engage in stereotypic actions at a greater rate than the ground truth skill distribution (dashed red line), even as the underlying skill distribution changes. (B): Market decider's raw learning signal for mining and chopping groups. As the market decider's expectations prompt minority agents to engage in against-skill behaviors, the decider's learning signal exaggerates the underlying difference in skill sets, and maintains it after the underlying difference disappears.
Figure 5: Agent task with human participants. Participants were assigned to play the role of agents in the produce-and-trade task, in which the market predicted that the agent would bring resources consistent with the participant's true skill (blue) or contrary to the participant's true skill (red). Participants predicted to behave skill-consistently by the market decider progressively made a higher proportion of skill-consistent sales over time, while this proportion declined for participants for whom the market made mostly skill-inconsistent predictions (a). Further, the proportion of agents for which the market made skill-consistent predictions was higher for the initial agents than for the agents from Replacement 2 and Replacement 5 (b).
Figure 6: Market task with human participants. Participants were assigned to play the role of the market decider in the produce-and-trade task and predict the resources brought by agents to the market. Across all three conditions, participants predicted that agents who formed the majority within their group would behave more skill-consistently than agents who formed the minority, and this pattern became stronger over time such that in later trials, participants made more stereotypic predictions about both the majority and the minority.
