The Bounds of Algorithmic Collusion; $Q$-learning, Gradient Learning, and the Folk Theorem

Galit Askenazi-Golan, Domenico Mergoni Cecchelli, Edward Plumb, Clemens Possnig

TL;DR

A Folk Theorem-style result is obtained for a wide range of learning dynamics in repeated games, and the set of payoff vectors that these dynamics can attain is characterised, revealing a wide range of possibilities for the emergence of algorithmic collusion.

Abstract

We explore the behaviour emerging from learning agents repeatedly interacting strategically for a wide range of learning dynamics, including $Q$-learning, projected gradient, replicator and log-barrier dynamics. Going beyond the better understood classes of potential games and zero-sum games, we consider the setting of a general repeated game with finite recall under different forms of monitoring. We obtain a Folk Theorem-style result and characterise the set of payoff vectors that can be obtained by these dynamics, discovering a wide range of possibilities for the emergence of algorithmic collusion. Achieving this requires a novel technical approach, which, to the best of our knowledge, yields the first convergence result for multi-agent $Q$-learning algorithms in repeated games.

Paper Structure

This paper contains 25 sections, 12 theorems, 90 equations, 2 algorithms.

Key Result

Theorem 3.3

For all $\varepsilon>0$ there is $\delta^*\in (0,1)$ such that for all $\delta\in (\delta^*, 1)$ and every $u\in \tilde{W}$, there exists $\ell \in \mathbb{N}$ and an $\ell$-recall strict subgame-perfect equilibrium $\pi^*$ of $\Gamma(\delta)$ such that the distance between $u$ and $V(\pi^*)$ is at

Theorems & Definitions (25)

  • Definition 3.1: $\ell$-recall equilibrium
  • Definition 3.2: $\ell$-recall subgame-perfect equilibrium
  • Theorem 3.3
  • Definition 3.4: $\varepsilon$-finite implementation
  • Theorem 4.1
  • Definition 5.1: $q$-gradient
  • Definition 5.2
  • Theorem 5.3
  • Theorem A.1
  • Proof of Theorem \ref{thm: payoff_approx}
  • ...and 15 more