Mathematics of multi-agent learning systems at the interface of game theory and artificial intelligence

Long Wang, Feng Fu, Xingru Chen

TL;DR

The cross-fertilization of ideas between evolutionary game theory and artificial intelligence will contribute to the advancement of the mathematics of multi-agent learning systems, in particular to the nascent domain of "collective cooperative intelligence", which bridges evolutionary dynamics and multi-agent reinforcement learning.

Abstract

Evolutionary Game Theory (EGT) and Artificial Intelligence (AI) are two fields that might seem distinct at first glance, but they have notable connections and intersections. EGT focuses on the evolution of behaviors (or strategies) in a population, where individuals interact with others and update their strategies based on imitation (or social learning): the more successful a strategy is, the more prevalent it becomes over time. AI, meanwhile, is centered on machine learning algorithms and (deep) neural networks. It often takes a single-agent perspective but increasingly involves multi-agent environments, in which intelligent agents adjust their strategies based on feedback and experience, a process somewhat akin to evolution yet distinct in the agents' capacity for self-learning. Addressing real-world problems requires several key components: (i) learning and adaptation, (ii) cooperation and competition, (iii) robustness and stability, and, taken together, (iv) population dynamics of individual agents whose strategies evolve. In light of these components, the cross-fertilization of ideas between both fields will contribute to the advancement of the mathematics of multi-agent learning systems, in particular to the nascent domain of "collective cooperative intelligence", which bridges evolutionary dynamics and multi-agent reinforcement learning.

Paper Structure (1 section, 1 figure)

This paper contains 1 section, 1 figure.

Table of Contents

  1. Acknowledgements.

Figures (1)

  • Figure 1: Leveraging unbending strategies for steering and stabilizing fairness and cooperation. Shown are the co-adaptive learning dynamics between two players in a repeated donation game. Player X starts with an extortionate zero-determinant strategy $[p_1, p_2, p_1, p_2]$ with extortion factor $\chi = 2.8$. Player Y starts with (a)-(c) an unbending strategy $[1, q_2, 0, q_4]$ from Class A or (d)-(f) an unbending strategy $[q_1, q_2, q_1, q_2]$ from Class D (see Ref. chen2023outlearning for detailed classifications). The benefit-to-cost ratio of the donation game is denoted by $r$, and the relative time scale at which player Y changes behavior compared with player X is denoted by $\omega$. Both players aim to maximize their respective payoffs $s_X$ and $s_Y$. The learning curves of (a)-(c) $p_1$, $p_2$, $q_2$, and $q_4$ and of (d)-(f) $p_1$, $p_2$, $q_1$, and $q_2$ are shown with respect to time, together with the initial and final payoffs of the two players. Circles stand for player X and squares for player Y, empty and solid points represent the initial and final states, and arrows indicate the direction of learning. For comparison, three cases are considered: (a), (d) $\omega \to 0$; (c), (f) $\omega \to 1$; and (b), (e) $\omega$ in between.
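
To make the quantities in the Figure 1 caption concrete, the following is a minimal sketch of how the long-run payoffs $s_X$ and $s_Y$ of two memory-one strategies in a repeated donation game can be computed from the stationary distribution of the induced Markov chain. It is not the authors' code. The function names, the normalization $c = 1$, the value $r = 3$, the choice of the scale parameter `phi`, and the example strategy for player Y are all assumptions made for illustration. The extortionate strategy follows the standard zero-determinant construction with baseline payoff zero; with `phi = 1 / (c + chi * b)` it takes the form $[p_1, 0, p_1, 0]$, which is one instance of the form $[p_1, p_2, p_1, p_2]$ named in the caption.

```python
import numpy as np

def transition_matrix(p, q):
    """4x4 Markov chain over outcomes (CC, CD, DC, DD), written from X's view.

    p[i] is X's cooperation probability after outcome i. q is indexed from
    Y's own view, so Y reads X's CD as DC and vice versa.
    """
    q_seen = [q[0], q[2], q[1], q[3]]
    rows = [[px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)]
            for px, qy in zip(p, q_seen)]
    return np.array(rows)

def stationary_distribution(M):
    # Solve v M = v together with sum(v) = 1 in the least-squares sense.
    # Assumes the chain has a unique stationary distribution.
    A = np.vstack([M.T - np.eye(4), np.ones(4)])
    rhs = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
    v, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return v

def payoffs(p, q, r, c=1.0):
    """Long-run payoffs (s_X, s_Y); cooperation costs c and gives b = r * c."""
    b = r * c
    v = stationary_distribution(transition_matrix(p, q))
    s_x = v @ np.array([b - c, -c, b, 0.0])
    s_y = v @ np.array([b - c, b, -c, 0.0])
    return s_x, s_y

def extortionate_zd(chi, r, phi=None, c=1.0):
    """Zero-determinant strategy enforcing s_X = chi * s_Y (assumed form)."""
    b = r * c
    if phi is None:
        phi = 1.0 / (c + chi * b)  # assumption: largest admissible scale
    p = [1 - phi * (chi - 1) * (b - c),
         1 - phi * (c + chi * b),
         phi * (b + chi * c),
         0.0]
    return np.clip(p, 0.0, 1.0)  # guard against tiny rounding errors

# Hypothetical example: chi = 2.8 as in the caption, r = 3 chosen arbitrarily.
p = extortionate_zd(chi=2.8, r=3.0)
q = [0.9, 0.5, 0.3, 0.4]  # arbitrary memory-one strategy for player Y
s_x, s_y = payoffs(p, q, r=3.0)
print(p, s_x, s_y, s_x / s_y)
```

Against a fixed opponent, the printed ratio `s_x / s_y` should equal the extortion factor up to numerical error. The co-adaptive learning shown in the figure would additionally update both strategies over time, which this sketch does not attempt.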