Multi-agent reinforcement learning in the all-or-nothing public goods game on networks

Benedikt Valentin Meylahn

TL;DR

The paper studies trust formation in an all-or-nothing public goods game played on networks, using exponential moving average learning to infer neighbor contributions. It proves that, on any connected network, the adaptive learning dynamics converge in the long run to a pure-strategy consensus where all agents either always contribute or always defect, with metastable pre-limit behavior possible on complex networks. Simulations on random geometric graphs reveal that higher network density slows convergence and fosters defecting states, while regular graphs show quicker, non-metastable convergence; local network structure strongly shapes interim trust patterns. The findings imply that promoting global public goods may be more effective when leveraging small, well-connected local groups and considering network topology in designing interventions. Overall, the work links network structure, learning dynamics, and threshold public goods to explain how trust and contribution emerge and stabilize in complex systems.

Abstract

We study interpersonal trust by means of the all-or-nothing public goods game between agents on a network. The agents are endowed with the simple yet adaptive learning rule, exponential moving average, by which they estimate the behavior of their neighbors in the network. Theoretically we show that in the long-time limit this multi-agent reinforcement learning process always eventually results in indefinite contribution to the public good or indefinite defection (no agent contributing to the public good). However, by simulation of the pre-limit behavior, we see that on complex network structures there may be mixed states in which the process seems to stabilize before actual convergence to states in which agent beliefs and actions are all the same. In these metastable states the local network characteristics can determine whether agents have high or low trust in their neighbors. More generally it is found that more dense networks result in lower rates of contribution to the public good. This has implications for how one can spread global contribution toward a public good by enabling smaller local interactions.
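
The exponential moving average rule mentioned in the abstract can be sketched as follows. This is a minimal illustration of a generic EMA update, not the paper's exact belief equation, which is not reproduced in this summary. The function name `ema_update`, the encoding of an observation as 1.0 (neighbor contributed) or 0.0 (neighbor defected), and the starting estimate of 0.5 are all assumptions made for illustration; the only quantity taken from the source is a learning rate $\alpha\in(0,1)$.

```python
def ema_update(estimate: float, observation: float, alpha: float) -> float:
    """Return the exponential moving average after one new observation.

    estimate:    current belief in [0, 1] (hypothetical encoding of trust)
    observation: 1.0 if the neighbor contributed, 0.0 otherwise (assumption)
    alpha:       learning rate, strictly between 0 and 1
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Weight the old belief by (1 - alpha) and the new observation by alpha.
    return (1.0 - alpha) * estimate + alpha * observation


# Hypothetical usage: an agent starts undecided and observes four rounds.
estimate = 0.5
for observed in [1.0, 1.0, 0.0, 1.0]:
    estimate = ema_update(estimate, observed, alpha=0.1)
```

Under this generic update, an estimate that starts in $[0,1]$ stays in $[0,1]$, and the values 0 and 1 are left unchanged by observations that agree with them, which is in keeping with the all-defect and all-contribute consensus states described above.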

Paper Structure (21 sections, 3 theorems, 20 equations, 8 figures, 3 tables)

Introduction
Literature: General
Literature: All-or-nothing, and learning
Research gap and contributions
The model
Base game
Population model of the all-or-nothing public goods game
Agent belief
Agent actions
Public goods on networks
Main theoretical result
Simulation setup
Random geometric network results
Regular graphs, and metastability in geometric graphs
Conclusion
...and 6 more sections

Key Result

Proposition 1

For learning rate $\alpha\in(0,1)$ and some $\epsilon\in (0,\alpha/d)$, where $d>0$ is the maximum degree in the graph minus one (note that for any connected graph with $N>2$ the maximum degree is bigger than 1), if $F(0)=0$, $F(1)=1$, $F(x)\in (0,1)$ for all $x\in (0,1)$ and there exist finite order d

Figures (8)

Figure 1: Illustration of how the players are selected to play a round of the all-or-nothing public goods game on a graph of 8 players. The focal player (encircled by a dashed line) is $i=6$. The players taking part in the game are thus $K_t = N[6] = \{6,5,7,8\}$ (indicated by filled nodes) and so $k_t=4$.
Figure 2: Tail probabilities ($\mathbb{P}(\tau\geq t)$) on a log-log scale for the time to convergence for different values of the radius $r_g$ used in the random geometric graph model to sample networks.
Figure 3: Average final estimate scatter plotted against various network characteristics. Each point is not fully opaque so that darker regions are indicative of more data points.
Figure 4: State at termination of simulation for iterations in which convergence was not reached. This illustrates how the agent belief before absorption in a state of complete or zero trust appears localized, with high trust in regions with few connections, and low trust in regions with many connections.
Figure 5: Tail probabilities ($\mathbb{P}(\tau\geq t)$) (on a log-log scale) for the time to convergence for different values of nearest neighbors $l$.
...and 3 more figures

Theorems & Definitions (3)

Proposition 1
Lemma 1: Absorption in the corners is possible
Lemma 2: Reaching the corners is possible
