A Multi-agent Reinforcement Learning Study of Evolution of Communication and Teaching under Libertarian and Utilitarian Governing Systems

Aslan S. Dizaji

TL;DR

There is strong evidence that a collectivistic environment, such as the Full-Utilitarian system, is more favourable for the emergence of communication and teaching, or more precisely, for the evolution of language alignment. There is also some evidence that the evolution of language alignment through communication and teaching under collectivistic governing systems makes individuals more advantageously inequity averse.

Abstract

Laboratory experiments have shown that communication plays an important role in solving social dilemmas. Here, by extending the AI-Economist, a mixed-motive multi-agent reinforcement learning environment, I address the following descriptive question: which governing system facilitates the emergence and evolution of communication and teaching among agents? To answer this question, the AI-Economist is extended with a voting mechanism that simulates three governing systems along the individualistic-collectivistic axis, from Full-Libertarian to Full-Utilitarian. The AI-Economist is further extended to include communication with possible misalignment, a variant of the signalling game, by letting agents build houses together if they are able to name mutually complementary material resources by the same letter. A third extension adds teaching with possible misalignment, again a variant of the signalling game: half of the agents are teachers, who know how to use mutually complementary material resources to build houses but cannot build houses themselves, and the other half are students, who lack this information but can build those houses if the teachers teach them. I found strong evidence that a collectivistic environment, such as the Full-Utilitarian system, is more favourable for the emergence of communication and teaching, or more precisely, for the evolution of language alignment. Moreover, I found some evidence that the evolution of language alignment through communication and teaching under collectivistic governing systems makes individuals more advantageously inequity averse. As a result, there is a positive correlation between the evolution of language alignment and equality in the society.
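The naming condition described in the abstract can be illustrated with a minimal sketch. It assumes that each agent holds a lexicon mapping each of the four resources to a letter, and that alignment between two agents is the number of resources they name identically, so that full alignment equals 4 and is required for building together. The resource labels (r1 to r4), the function names, and the pairwise averaging are hypothetical choices made for illustration, not the paper's implementation.

```python
from itertools import combinations

# Hypothetical resource labels; the source only states that there are four resources.
RESOURCES = ("r1", "r2", "r3", "r4")


def alignment(lexicon_a, lexicon_b):
    """Count the resources that both agents name with the same letter (0 to 4)."""
    return sum(lexicon_a[r] == lexicon_b[r] for r in RESOURCES)


def can_build_together(lexicon_a, lexicon_b):
    """Assumed rule: joint building requires full alignment on all four resources."""
    return alignment(lexicon_a, lexicon_b) == len(RESOURCES)


def mean_pairwise_alignment(lexicons):
    """Average alignment over all unordered pairs of agents (an assumed summary statistic)."""
    pairs = list(combinations(lexicons, 2))
    return sum(alignment(a, b) for a, b in pairs) / len(pairs)


# Toy lexicons: agent_2 swaps the letters of r3 and r4, agent_3 copies agent_1.
agent_1 = {"r1": "A", "r2": "B", "r3": "C", "r4": "D"}
agent_2 = {"r1": "A", "r2": "B", "r3": "D", "r4": "C"}
agent_3 = dict(agent_1)

print(alignment(agent_1, agent_2))           # partial alignment
print(can_build_together(agent_1, agent_3))  # fully aligned pair
```

In this toy setting, only the fully aligned pair (agent_1 and agent_3) would be allowed to build a house together.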

Paper Structure (8 sections, 4 equations, 14 figures)

This paper contains 8 sections, 4 equations, 14 figures.

Introduction
The Modified AI-Economist with Communication/Teaching
Results
Final Remarks
Current Limitations
Future Directions
Appendix A: The AI-Economist
Appendix B: Supplemental Figures

Figures (14)

Figure 1: A schematic of the environment of the Modified AI-Economist with Communication/Teaching used in this paper. In all simulations, 6 agents simultaneously cooperate and compete to gather and trade four natural resources, use them to build houses alone or together (via communication or teaching), earn income, and pay taxes to the central planner at the end of each tax period. The central planner optimizes its own reward function, which can be a combination of equality and productivity in the society, and returns the total collected taxes to the mobile agents in equal shares.
Figure 2: Sample plots obtained from running the Modified AI-Economist with Communication under the Semi-Libertarian/Utilitarian governing system, with equality times productivity as the objective function of the central planner. (A) The environment at five time-points of an episode, (B) the movement of the agents across an episode, (C) the agents' budgets of the four resources plus coin and labor across an episode, (D) the agents' trades of the four resources across an episode, (E) the counted votes of the agents across an episode, and (F) the inequity aversion coefficients of the agents across an episode.
Figure 3: Sample plots obtained from running the Modified AI-Economist with Teaching under the Semi-Libertarian/Utilitarian governing system, with equality times productivity as the objective function of the central planner. (A) The environment at five time-points of an episode, (B) the movement of the agents across an episode, (C) the agents' budgets of the four resources plus coin and labor across an episode, (D) the agents' trades of the four resources across an episode, (E) the counted votes of the agents across an episode, and (F) the inequity aversion coefficients of the agents across an episode.
Figure 4: The language alignment across an episode for the three governing systems of the Modified AI-Economist with Communication. As is clear from this plot, a collectivistic government such as the Full-Utilitarian governing system is more favourable for the evolution of language alignment than an individualistic government such as the Full-Libertarian governing system. However, full alignment (a value of 4) is not reached under any of these governing systems, so no agents can build houses together.
Figure 5: The language alignment across an episode for the three governing systems of the Modified AI-Economist with Teaching. As is clear from this plot, a collectivistic government such as the Full-Utilitarian governing system is more favourable for the evolution of language alignment than an individualistic government such as the Full-Libertarian governing system. In this case, full alignment (a value of 4) is reached under all three governing systems, so the agents can build houses together under each of them.
...and 9 more figures
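
The figure captions refer to a central planner whose objective is equality times productivity and who returns collected taxes to the agents in equal shares. The sketch below shows one way these quantities can be computed. It assumes the definitions commonly used with the original AI-Economist (equality as one minus a normalized Gini index of coin endowments, productivity as the sum of coin endowments); the function names and toy numbers are hypothetical, and the modified environment in this paper may differ in detail.

```python
def gini(coins):
    """Gini index of a list of non-negative coin endowments."""
    n = len(coins)
    total = sum(coins)
    if total == 0:
        return 0.0  # convention chosen here: nobody owns anything, so no inequality
    diff_sum = sum(abs(a - b) for a in coins for b in coins)
    return diff_sum / (2 * n * total)


def equality(coins):
    """Assumed definition: 1 - gini * n / (n - 1), which lies in [0, 1]."""
    n = len(coins)
    return 1.0 - gini(coins) * n / (n - 1)


def productivity(coins):
    """Assumed definition: total coin held by all agents."""
    return sum(coins)


def planner_objective(coins):
    """Equality times productivity, as named in the captions of Figures 2 and 3."""
    return equality(coins) * productivity(coins)


def redistribute(coins, taxes):
    """Subtract each agent's tax and return the total tax in equal shares."""
    share = sum(taxes) / len(coins)
    return [c - t + share for c, t in zip(coins, taxes)]


# Toy example with 6 agents, matching the number of agents in Figure 1.
before = [60, 0, 0, 0, 0, 0]
after = redistribute(before, taxes=[30, 0, 0, 0, 0, 0])
print(planner_objective(before), planner_objective(after))
```

Under these assumed definitions, redistribution leaves productivity unchanged while raising equality, so the planner objective increases in the toy example.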
