Emergent Communication through Negotiation

Kris Cao, Angeliki Lazaridou, Marc Lanctot, Joel Z Leibo, Karl Tuyls, Stephen Clark

TL;DR

This study investigates how communication emerges in a negotiation setting using multi-agent reinforcement learning by comparing a task-grounded proposal channel with a language-like cheap-talk channel. Self-interested agents learn fair negotiation when communication is grounded but fail to ground cheap talk, while prosocial agents successfully utilize cheap talk to coordinate toward near-optimal joint allocations. Extending to a society of heterogeneous agents, identifiability and community dynamics influence negotiation and language emergence, with language-like communication arising primarily among prosocial populations. The findings suggest cooperation is a prerequisite for language emergence and highlight the impact of social structure on the evolution of communicative behavior in multi-agent systems.

Abstract

Multi-agent reinforcement learning offers a way to study how communication could emerge in communities of agents needing to solve specific problems. In this paper, we study the emergence of communication in the negotiation environment, a semi-cooperative model of agent interaction. We introduce two communication protocols -- one grounded in the semantics of the game, and one which is a priori ungrounded and is a form of cheap talk. We show that self-interested agents can use the pre-grounded communication channel to negotiate fairly, but are unable to effectively use the ungrounded channel. However, prosocial agents do learn to use cheap talk to find an optimal negotiating strategy, suggesting that cooperation is necessary for language to emerge. We also study communication behaviour in a setting where one agent interacts with agents in a community with different levels of prosociality and show how agent identifiability can aid negotiation.

Paper Structure

This paper contains 27 sections, 1 equation, 6 figures, 7 tables.

Figures (6)

  • Figure 1: High-level overview of the negotiation environment that we implement. Agent A consistently refers to the agent who goes first.
  • Figure 2: a) Training curves for self-interested agents learning to negotiate under the various communication channels. The results show the mean across 20 different random seeds, as well as bootstrapped confidence intervals via shading (only visible for the linguistic communication case). b) The optimality of the proposed item division as negotiation proceeds for two selfish agents communicating via the proposal channel, shown with error bars for interquartile range.
  • Figure 3: a) Unigram statistics of symbol usage broken down by turn and by position within the utterance for prosocial agents communicating via the linguistic channel. b) Bigram counts for prosocial agents communicating via the linguistic channel, sorted by frequency.
  • Figure 4: PCA plot of whitened opponent ID embeddings learnt by a fixed agent 1 for a variety of reward schemes and communication channels.
  • Figure 5: a) Unigram statistics of symbol usage broken down by turn and by position within the utterance for selfish agents communicating via the linguistic channel. b) Bigram counts for selfish agents communicating via the linguistic channel, sorted by frequency.
  • ...and 1 more figure