Bidirectional Emergent Language in Situated Environments

Cornelius Wolff, Julius Mayer, Elia Bruni, Xenia Ohmer

TL;DR

This paper introduces two novel cooperative environments, Multi-Agent Pong and Collectors, and uses methods from explainable AI research to track and interpret the agents' language channel use over time. It finds that the emerging communication is sparse.

Abstract

Emergent language research has made significant progress in recent years, but still largely fails to explore how communication emerges in more complex and situated multi-agent systems. Existing setups often employ a reference game, which limits the range of language emergence phenomena that can be studied, as the game consists of a single, purely language-based interaction between the agents. In this paper, we address these limitations and explore the emergence and utility of token-based communication in open-ended multi-agent environments, where situated agents interact with the environment through movement and communication over multiple time-steps. Specifically, we introduce two novel cooperative environments: Multi-Agent Pong and Collectors. These environments are interesting because optimal performance requires the emergence of a communication protocol, but moderate success can be achieved without one. By employing various methods from explainable AI research, such as saliency maps, perturbation, and diagnostic classifiers, we are able to track and interpret the agents' language channel use over time. We find that the emerging communication is sparse, with the agents only generating meaningful messages and acting upon incoming messages in states where they cannot succeed without coordination.

Paper Structure (26 sections, 2 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: The two environments used for our experiments.
  • Figure 2: Average length per epoch as a measure of how successful agents are at solving the environment.
  • Figure 3: Sensitivity of the language channel of the agents during an episode of Multi-Agent Pong with a sequence length of 1.
  • Figure 4: The number of important messages -- defined by the saliency values -- strongly influences the agents' success during training for different sequence lengths.
  • Figure 5: The number of input utterances strongly influencing the agents' actions during training for different sequence lengths. An utterance is important when at least one of its normalized saliency values reaches 0.8.
  • ...and 1 more figure
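
The importance criterion stated in the Figure 5 caption (an utterance counts as important when at least one normalized saliency value reaches 0.8) can be illustrated with a minimal sketch. The function name, the data layout (one list of per-token saliency values per utterance), and the example numbers are assumptions made for illustration only; the paper's actual normalization and implementation are not shown here.

```python
from typing import Sequence


def count_important_utterances(
    saliencies: Sequence[Sequence[float]], threshold: float = 0.8
) -> int:
    """Count utterances with at least one normalized saliency >= threshold.

    `saliencies` is assumed to hold already-normalized values in [0, 1],
    one inner sequence per utterance (hypothetical layout).
    """
    return sum(
        1
        for utterance in saliencies
        if any(value >= threshold for value in utterance)
    )


# Hypothetical normalized saliency values: four utterances of sequence length 2.
example = [[0.10, 0.05], [0.85, 0.20], [0.30, 0.80], [0.79, 0.40]]
print(count_important_utterances(example))  # → 2
```

Here the second and third utterances qualify, because each contains a value of at least 0.8, while the fourth falls just short at 0.79.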