Towards More Human-like AI Communication: A Review of Emergent Communication Research

Nicolo' Brandizzi

TL;DR

The paper surveys Emergent Communication (EmCom) as a path toward more human-like AI language by studying language emergence through interaction in multi-agent and reinforcement learning settings. It distinguishes Machine-centered EmCom (Mac-EmCom) from Human-centered EmCom (Hum-EmCom), and analyzes four core properties (game environment, input representation, learning paradigm, and Theory of Mind) to understand generalization and interpretability. It reviews methods, metrics, and challenges across both subfields, including iterative and population learning, language drift, and the balance between supervised and reinforcement learning, while advocating stronger cross-disciplinary integration with linguistics, cognitive science, and sociology. The authors emphasize that human-in-the-loop designs and human-centered grounding are essential to achieving robust, trustworthy human-machine communication with practical impact. The work also presents a comprehensive table of referenced papers, highlighting commonalities and gaps to guide future research in EmCom and human-agent collaboration.

Abstract

In the recent shift towards human-centric AI, the need for machines to accurately use natural language has become increasingly important. While a common approach to achieve this is to train large language models, this method presents a form of learning misalignment where the model may not capture the underlying structure and reasoning humans employ in using natural language, potentially leading to unexpected or unreliable behavior. Emergent communication (EmCom) is a field of research that has seen a growing number of publications in recent years, aiming to develop artificial agents capable of using natural language in a way that goes beyond simple discriminative tasks and can effectively communicate and learn new concepts. In this review, we present EmCom under two aspects. Firstly, we delineate all the common properties we find across the literature and how they relate to human interactions. Secondly, we identify two subcategories and highlight their characteristics and open challenges. We encourage researchers to work together by demonstrating that different methods can be viewed as diverse solutions to a common problem and emphasize the importance of including diverse perspectives and expertise in the field. We believe a deeper understanding of human communication is crucial to developing machines that can accurately use natural language in human-machine interactions.

Paper Structure

This paper contains 65 sections, 1 equation, 9 figures, 1 table.

Figures (9)

  • Figure 1: Exploring the multidisciplinary nature of Emergent Communication: A Venn Diagram showcasing the intersections between Linguistics, Cognitive Science, Computer Science, and Sociology. Each field contributes unique characteristics to the study of EmCom (shown in the figure as encompassing the other fields), with some commonalities across multiple fields. At the center of our analysis lies the crucial area of Human-Machine Interaction.
  • Figure 2: General pipeline for a discriminative referential game. The sender is shown a target image (a pencil) and is tasked to generate a message. The receiver sees a pool of images (distractors) containing the target and must choose the correct one based on the message.
  • Figure 3: Visualization of sampling graphs for 3-ary discrete $D \sim \mathrm{Discrete}(\alpha)$ and 3-ary Concrete $X \sim \mathrm{Concrete}(\alpha, \lambda)$. White operations are deterministic, blue are stochastic, rounded are continuous, and square are discrete. The top node is an example state; brightness indicates a value in $[0,1]$. Image and caption taken from Maddison et al. (2016).
  • Figure 4: Visualization of interaction types in Emergent Communication, with a horizontal space dimension and a vertical time dimension. The horizontal dimension is split into three parts for inner, outer cooperation, and outer competition. Teams are represented by squares, and their interconnections are indicated by arrows of different colors: green for cooperative, red for competitive, and gray dotted lines for time.
  • Figure 5: Illustration of Theory of Mind in artificial agents: Agent 2 must choose between pizza and gelato. In the modeling agents approach, Agent 1 predicts Agent 2's choice based on their preferences or past behavior. In the influencing others approach, Agent 1 takes action to influence Agent 2 to select a specific option.
  • ...and 4 more figures
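
The two sampling graphs named in the Figure 3 caption can be sketched in a few lines. This is a minimal illustration, not code from the paper: it assumes the standard Gumbel-max construction for the discrete sample and the softmax-with-temperature form of the Concrete relaxation from Maddison et al. (2016). The function names, the example weights `alpha`, and the temperatures are hypothetical choices made for illustration.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel(0, 1) variate via G = -log(-log(U))."""
    u = rng.random()
    # rng.random() can return exactly 0.0, for which log(u) is undefined.
    while u == 0.0:
        u = rng.random()
    return -math.log(-math.log(u))


def sample_discrete(alpha, rng):
    """Gumbel-max trick: one-hot sample with P(k) proportional to alpha[k]."""
    scores = [math.log(a) + sample_gumbel(rng) for a in alpha]
    winner = max(range(len(alpha)), key=lambda k: scores[k])
    return [1.0 if k == winner else 0.0 for k in range(len(alpha))]


def sample_concrete(alpha, lam, rng):
    """Concrete relaxation: softmax of (log alpha + Gumbel noise) / lam."""
    logits = [(math.log(a) + sample_gumbel(rng)) / lam for a in alpha]
    peak = max(logits)  # subtract the max before exponentiating for stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)        # fixed seed so runs are repeatable
alpha = [2.0, 0.5, 1.0]       # hypothetical unnormalized class weights (3-ary)

d = sample_discrete(alpha, rng)            # a one-hot vector
x_warm = sample_concrete(alpha, 1.0, rng)  # smooth point on the simplex
x_cold = sample_concrete(alpha, 0.05, rng) # low temperature, closer to one-hot
print(d, x_warm, x_cold)
```

Both samplers share the same stochastic step (adding Gumbel noise to `log alpha`). The discrete version ends in an `argmax`, while the Concrete version replaces it with a temperature-scaled softmax, so the output stays continuous in `alpha`. As the temperature `lam` approaches zero, Concrete samples concentrate near the vertices of the simplex.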