Exploration enhances cooperation in the multi-agent communication system

Zhao Song; Chen Shen; Zhen Wang; The Anh Han

TL;DR

This work proposes a two-stage evolutionary game-theoretical model that integrates signalling with a donation game and explicitly incorporates exploration into decision-making. The results suggest that embracing strategic exploration, as a form of engineered randomness, is essential to sustain cooperation and realise optimal performance in communication-based intelligent systems.

Abstract

Designing protocols that enhance cooperation in multi-agent systems remains a grand challenge. Cheap talk, defined as costless, non-binding communication before formal action, serves as a pivotal solution. However, existing theoretical frameworks often exclude random exploration, or noise, for analytical tractability, leaving its functional impact on system performance largely unexplored. To bridge this gap, we propose a two-stage evolutionary game-theoretical model that integrates signalling with a donation game and explicitly incorporates exploration into decision-making. Our agent-based simulations across topologies reveal a universal optimal exploration rate that maximises system-wide cooperation. Mechanistically, moderate exploration undermines the stability of defection and catalyses self-organised cooperative alliances, facilitating their cyclic success. Moreover, the cooperation peak is enabled by a delicate balance between oscillation period and amplification. Our findings suggest that rather than pursuing deterministic rigidity, embracing strategic exploration, as a form of engineered randomness, is essential to sustain cooperation and realise optimal performance in communication-based intelligent systems.
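
The two-stage structure described in the abstract (a costless signal followed by a donation game) can be illustrated with a minimal sketch. The strategy encoding below (one signal plus one action per possible co-player signal), the signal labels "A" and "N", and the benefit/cost values `b` and `c` are assumptions made for illustration; they are not taken from the paper, whose strategy set and parameterisation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """Hypothetical encoding: a signal plus a response to each co-player signal."""
    signal: str  # costless message sent in stage 1: "A" or "N" (assumed labels)
    if_a: str    # action ("C" or "D") when the co-player signals "A"
    if_n: str    # action ("C" or "D") when the co-player signals "N"


def act(strategy: Strategy, co_signal: str) -> str:
    """Stage 2: choose an action conditioned on the co-player's signal."""
    return strategy.if_a if co_signal == "A" else strategy.if_n


def play_round(s1: Strategy, s2: Strategy, b: float = 3.0, c: float = 1.0):
    """Play one two-stage round and return both payoffs.

    Donation game: a cooperator pays cost c so that the co-player
    receives benefit b. Signals are cheap talk, so they never enter
    the payoff directly.
    """
    a1 = act(s1, s2.signal)
    a2 = act(s2, s1.signal)
    p1 = (b if a2 == "C" else 0.0) - (c if a1 == "C" else 0.0)
    p2 = (b if a1 == "C" else 0.0) - (c if a2 == "C" else 0.0)
    return p1, p2


# A signal-conditioned cooperator meets an unconditional defector.
conditional = Strategy(signal="A", if_a="C", if_n="D")
defector = Strategy(signal="N", if_a="D", if_n="D")
print(play_round(conditional, conditional))
print(play_round(conditional, defector))
```

Because signals carry no cost and do not bind the sender, a signal changes payoffs only through the action it triggers in the co-player.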

Paper Structure (17 sections, 5 equations, 7 figures, 1 table, 1 algorithm)

Introduction
Preliminaries
Model and Method
The Two-Stage Cheap Talk Game
Payoff Formulation
Population Updates
Exploration and Imitation
Experimental Results
Implementation Details
Cheap talk remains a powerful tool for cooperation
Optimal exploration enhances overall cooperation
Exploration enables the success of cooperative strategies
Optimal exploration balances the oscillation period and amplification
Exploration is not the panacea
Discussion
...and 2 more sections

Figures (7)

Figure 1: Two-stage cheap talk game and population structures. (a) The game includes two stages: pre-game communication and in-game decision-making. In each evolutionary update step, a focal player either explores a new random strategy or learns the strategy of a random role model. (b) Examples of the investigated population structures: square lattice, small-world, random, well-mixed, and scale-free networks.
Figure 2: Cheap talk sustains cooperation in the exploration scenarios, with optimal exploration maximising the overall cooperation. Panels (a)-(c) show the frequency of cooperation across parameter sets without and with exploration; panels (d)-(f) show the average frequency and the standard deviation of cooperation as a function of exploration. Shown are the results of independent simulations on lattice, small-world, and random networks, respectively, with parameters uniformly sampled from $r \in [0,0.3]$, $\gamma \in [0,0.3]$, and $\mu \in [10^{-5},10^{-\frac{3}{4}}]$.
Figure 3: Cheap talk cannot sustain cooperation in the absence of network reciprocity, while exploration disturbs cooperation on heterogeneous networks. Panels (a) and (c) show the frequency of cooperation across parameter sets without and with exploration; panels (b) and (d) show the average frequency and the standard deviation of cooperation as a function of exploration. Shown are the results of independent simulations on well-mixed and scale-free networks, respectively, with parameters uniformly sampled from $r \in [0,0.3]$, $\gamma \in [0,0.3]$, and $\mu \in [10^{-5},10^{-\frac{3}{4}}]$.
Figure 4: Exploration enables the optimal success of cooperative strategies across both intuitive and deliberative types. Panels show the average frequency and the standard deviation of each strategy as a function of exploration on lattice, small-world, and random networks, from top to bottom. Shown are the results of 100,000 simulations with parameters randomly sampled from uniform distributions of $\mu \in [10^{-5},10^{-\frac{3}{4}}]$, $r \in [0,0.3]$, $\gamma \in [0,0.3]$.
Figure 5: Moderate exploration maximises cooperation by balancing the amplitude and period of cyclic success. Shown is the time evolution of strategy frequencies starting from a population of full unconditional defection (NDD). Exploration governs the phase lag in the cycle NDD (purple) $\rightarrow$ ACD (red) $\rightarrow$ ACC (blue) $\rightarrow$ NDC (yellow). Parameters are set as $r=0.02$, $\gamma=0.1$, and $\mu=10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}$, and $10^{-1}$ from the left column to the right column, respectively.
...and 2 more figures
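
The update step described in the Figure 1 caption (a focal player either explores a new random strategy or learns from a random role model) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `update_step`, the data layout, and in particular the Fermi imitation probability with noise parameter `k` are illustrative choices, not details confirmed by the text above.

```python
import math
import random


def update_step(strategies, neighbours, payoffs, mu, strategy_space, rng, k=0.1):
    """Apply one asynchronous update to a randomly chosen focal player.

    strategies     : list of current strategy labels, one per player
    neighbours     : mapping from player index to a list of neighbour indices
    payoffs        : list of current payoffs, one per player
    mu             : exploration rate (probability of a random strategy switch)
    strategy_space : list of all available strategy labels
    k              : imitation noise for the Fermi rule (assumed, hypothetical)
    """
    focal = rng.randrange(len(strategies))

    # Exploration: with probability mu, adopt a uniformly random strategy.
    if rng.random() < mu:
        strategies[focal] = rng.choice(strategy_space)
        return "explore"

    # Imitation: pick a random role model among the focal player's neighbours.
    model = rng.choice(neighbours[focal])

    # Fermi rule (assumption): copy more readily when the model earns more.
    # The exponent is clipped to avoid floating-point overflow.
    x = (payoffs[focal] - payoffs[model]) / k
    x = max(-50.0, min(50.0, x))
    if rng.random() < 1.0 / (1.0 + math.exp(x)):
        strategies[focal] = strategies[model]
    return "imitate"


# Usage on a three-player ring, using strategy labels from the Figure 5 caption.
rng = random.Random(42)
strategies = ["NDD", "NDD", "ACC"]
neighbours = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
payoffs = [0.0, 0.0, 1.0]
space = ["NDD", "ACD", "ACC", "NDC"]
for _ in range(10):
    update_step(strategies, neighbours, payoffs, 0.01, space, rng)
print(strategies)
```

In a full simulation the payoffs would be recomputed from game play before each update; they are held fixed here only to keep the example short.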
