Quantifying the Self-Interest Level of Markov Social Dilemmas

Richard Willis, Yali Du, Joel Z Leibo, Michael Luck

TL;DR

The paper addresses how to measure and induce cooperation in complex Markov social dilemmas by defining and estimating the self-interest level $s^*$. It extends reward-transfer ideas from normal-form games to Markov games and uses multi-agent reinforcement learning with curriculum training to empirically identify the threshold at which reward exchange promotes cooperative equilibria. Applied to three Melting Pot environments, the method yields $s^*$ values of around $0.29$ for Commons Harvest and $0.25$–$0.29$ for Clean Up. Externality Mushrooms does not exhibit a Markov social dilemma structure, which limits the effectiveness of reward exchange there. The work offers a practical metric and a mechanism-design insight for fostering cooperation in multi-agent systems, and suggests avenues for broader applicability and refinement in heterogeneous or larger-scale settings.

Abstract

This paper introduces a novel method for estimating the self-interest level of Markov social dilemmas. We extend the concept of self-interest level from normal-form games to Markov games, providing a quantitative measure of the minimum reward exchange required to align individual and collective interests. We demonstrate our method on three environments from the Melting Pot suite, representing either common-pool resources or public goods. Our results illustrate how reward exchange can enable agents to transition from selfish to collective equilibria in a Markov social dilemma. This work contributes to multi-agent reinforcement learning by providing a practical tool for analysing complex, multistep social dilemmas. Our findings offer insights into how reward structures can promote or hinder cooperation, with potential applications in areas such as mechanism design.
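
To make the idea of reward exchange concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a symmetric scheme in which each agent keeps a fraction `s` of its own reward and splits the remainder equally among the other agents; this exact form, along with the name `exchange_rewards`, is an illustrative assumption and is not taken from the text above.

```python
def exchange_rewards(rewards, s):
    """Redistribute per-step rewards under an assumed symmetric exchange.

    Each agent keeps a fraction `s` of its own reward and gives the
    remaining (1 - s) in equal shares to the other n - 1 agents.
    This scheme is an illustrative assumption, not the paper's code.
    """
    n = len(rewards)
    if n < 2:
        raise ValueError("reward exchange needs at least two agents")
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in [0, 1]")
    total = sum(rewards)
    # Agent i receives s * r_i plus an equal share of what the others give away.
    return [s * r + (1.0 - s) * (total - r) / (n - 1) for r in rewards]


if __name__ == "__main__":
    raw = [1.0, 0.0, 0.0]
    print(exchange_rewards(raw, 1.0))      # fully selfish: rewards unchanged
    print(exchange_rewards(raw, 1.0 / 3))  # s = 1/n: every agent gets the mean
```

Under this assumed scheme the total reward is conserved, `s = 1` leaves rewards untouched, and `s = 1/n` gives every agent the group mean. Estimating $s^*$ would then amount to finding the threshold value of `s` at which trained agents switch between selfish and collective behaviour.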

Paper Structure

This paper contains 37 sections, 6 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Pretraining, increasing numbers of players
  • Figure 2: Iteratively decreasing self-interest during training
  • Figure 3: Training without curriculum learning
  • Figure 4: Schelling diagrams