Mimicry and the Emergence of Cooperative Communication

Dylan Cope, Peter McBurney

TL;DR

This work tackles the problem of how cooperative communication can emerge among co-evolving agents by leveraging mimicry of externally generated, useful signals. It combines theoretical analysis of independent versus centralised optimisation with empirical tests in a gridworld, using deep neuroevolution and MAPPO to compare dynamics with and without mimicable signals. The key finding is that mimicry can alter optimisation trajectories, helping systems escape non-communicative local optima and fostering the emergence of communication, though the benefits depend on signal source disambiguation and can entail trade-offs in later refinement. The results offer a principled mechanism to bootstrap communication in multi-agent systems, with implications for designing cooperative AI and understanding language emergence, while outlining directions for handling initialisation effects and potential negative consequences of signal mimicry.

Abstract

In many situations, communication between agents is a critical component of cooperative multi-agent systems; however, it can be difficult to learn or evolve. In this paper, we investigate a simple way in which the emergence of communication may be facilitated. Namely, we explore the effects of allowing agents to mimic preexisting, externally generated useful signals. The key idea is that these signals incentivise listeners to develop positive responses that can then also be invoked by speakers mimicking those signals. The investigation starts by formalising the problem and demonstrating that this form of mimicry changes optimisation dynamics and may provide the opportunity to escape non-communicative local optima. We then explore the problem empirically with a simulation in which spatially situated agents must communicate to collect resources. Our results show that both evolutionary optimisation and reinforcement learning may benefit from this intervention.

Paper Structure (17 sections, 3 theorems, 23 equations, 3 figures, 1 table)

Key Result

Theorem 1

Let $\theta_1$ implement an optimal non-communicative strategy, and let $\theta_2$ be a communicative strategy. Under $\theta_2$, when the speaker sends a signal, the guess accuracy must be greater than $\alpha$ for $\theta_2$ to be selected. Formally, the following must hold for $\theta_2$ to be

Figures (3)

  • Figure 1: The environment for experimentally testing the effects of mimicry on the emergence of communication. Two agents must be on the same square as a resource to collect it and receive a reward. Agents observe whether or not they are on the same square as a resource, and resources sometimes emit signals that can be detected by an agent within a limited number of tiles from the resource. In (a) the gold region indicates where the signal can be detected by an agent.
  • Figure 2: Reward curves for Evolution (a) and DMARL (b). All curves are means from multiple seeds with standard error bars.
  • Figure 3: Partial overlap MAPPO reward curves aligned to the iteration in which the mean total reward exceeded 0, plotted alongside the frequency with which agents mimicked the 'externally generated signals', i.e. used resource signals.
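
The environment rules described in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `Resource`, `on_resource`, `detects_signal` and `collection_reward` are hypothetical, the detection range is assumed to be a Chebyshev (square) radius, the reward value of 1.0 is an assumption, and whether a resource emits a signal on a given step is passed in as a flag rather than sampled.

```python
from dataclasses import dataclass
from typing import Tuple

Pos = Tuple[int, int]  # (x, y) tile coordinates on the grid


@dataclass(frozen=True)
class Resource:
    pos: Pos
    emitting: bool  # assumption: emission is decided elsewhere, per step


def on_resource(agent: Pos, resource: Resource) -> bool:
    """Observation: is the agent standing on the resource's square?"""
    return agent == resource.pos


def detects_signal(agent: Pos, resource: Resource, radius: int) -> bool:
    """True if the resource is emitting and the agent is within `radius` tiles.

    Assumption: distance is measured as Chebyshev distance (a square region).
    """
    dx = abs(agent[0] - resource.pos[0])
    dy = abs(agent[1] - resource.pos[1])
    return resource.emitting and max(dx, dy) <= radius


def collection_reward(agent_a: Pos, agent_b: Pos, resource: Resource,
                      reward: float = 1.0) -> float:
    """Reward is given only when both agents share the resource's square."""
    both_present = on_resource(agent_a, resource) and on_resource(agent_b, resource)
    return reward if both_present else 0.0
```

Under these assumptions, a single agent reaching a resource earns nothing on its own, which is why some form of signalling between agents is useful in this task.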

Theorems & Definitions (5)

  • Theorem 1
  • proof
  • Theorem 2
  • proof
  • Lemma 1