Algorithmic Collusion And The Minimum Price Markov Game

Igor Sadoune, Marcelin Joanis, Andrea Lodi

TL;DR

The paper introduces the Minimum Price Markov Game (MPMG), a complete-information, Markovian framework for studying tacit coordination and potential algorithmic collusion in first-price auctions governed by the minimum price rule, as is typical of public procurement. The underlying stage game (the MPG) has a Prisoner’s Dilemma-like structure; by extending it to a multi-agent Markov game, the authors probe whether the minimum price rule remains robust under MARL-driven learning in both heterogeneous and homogeneous agent populations. In experiments with MAB, D3QN, and MAPPO agents in 2- and 5-player configurations, they find that tacit coordination can emerge through self-reinforcing dynamics but is generally tempered by market power asymmetries. Notably, UCB-based agents converge more consistently toward Pareto-optimal coordination, while more sophisticated agents show mixed results. The work provides a quantitative benchmark for algorithmic pricing in public procurement, informing both regulatory perspectives and future research on extending the MPMG, relaxing its assumptions, and evaluating cyber-cartel risks in AI-driven markets.

Abstract

This paper introduces the Minimum Price Markov Game (MPMG), a theoretical model that reasonably approximates real-world first-price markets following the minimum price rule, such as public auctions. The goal is to provide researchers and practitioners with a framework to study market fairness and regulation in both digitized and non-digitized public procurement processes, amid growing concerns about algorithmic collusion in online markets. Using multi-agent reinforcement learning-driven artificial agents, we demonstrate that (i) the MPMG is a reliable model for first-price market dynamics, (ii) the minimum price rule is generally resilient to non-engineered tacit coordination among rational actors, and (iii) when tacit coordination occurs, it relies heavily on self-reinforcing trends. These findings contribute to the ongoing debate about algorithmic pricing and its implications.

Paper Structure (37 sections, 2 theorems, 20 equations, 6 figures, 2 tables)

Key Result

Proposition 1

The $n$-player homogeneous MPG is a Prisoner's Dilemma.
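
To make the Prisoner's Dilemma claim concrete, the following is a minimal sketch of how the two defining conditions (defection strictly dominates, yet mutual cooperation beats mutual defection) can be checked by brute force for a symmetric $n$-player game. The payoff function `toy_payoff`, its `high` and `low` prices, and the helper `is_prisoners_dilemma` are hypothetical illustrations and are not the paper's MPG payoff definition, which is not reproduced in this summary.

```python
from itertools import product


def toy_payoff(actions, i, high=1.0, low=0.6):
    """Hypothetical payoff of player i under a hard minimum price rule.

    actions[j] is 1 if player j bids low (defects) and 0 if player j
    bids high (cooperates). This toy rule is an assumption for
    illustration only, not the paper's payoff function.
    """
    n_low = sum(actions)
    if n_low == 0:
        # Everyone bids high: the market is split at the high price.
        return high / len(actions)
    if actions[i] == 1:
        # Low bidders share the market at the low price.
        return low / n_low
    # High bidders lose when at least one rival bids low.
    return 0.0


def is_prisoners_dilemma(n, payoff=toy_payoff):
    """Check the two Prisoner's Dilemma conditions for a symmetric game.

    The game is assumed symmetric, so it is enough to test player 0.
    """
    # (1) Defecting strictly dominates cooperating against every
    #     possible profile of the other n - 1 players.
    for others in product((0, 1), repeat=n - 1):
        if payoff((1,) + others, 0) <= payoff((0,) + others, 0):
            return False
    # (2) All-cooperate gives each player strictly more than all-defect.
    return payoff((0,) * n, 0) > payoff((1,) * n, 0)


if __name__ == "__main__":
    for n in (2, 5):
        print(n, is_prisoners_dilemma(n))
```

Under this toy rule the conditions hold whenever `high > low > high / n`; with `low` set below `high / n`, undercutting no longer pays against cooperating rivals and the check fails.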

Figures (6)

  • Figure 1: One-dimensional heat-map highlighting the relative difference among agents and MPMG configurations in terms of their collusive potential. From the most Nash (left) to the most Pareto (right) configuration.
  • Figure 2: Average joint action frequencies of UCB (a) and MAPPO (b) agents in the $2$-player homogeneous MPMG.
  • Figure 3: Average (across repeats) joint action frequencies over training episodes. Rows (top to bottom): UCB, $\epsilon$-greedy, TS, D3QN, MAPPO. Columns (left to right): $(n=2, \sigma(\beta)=0.0)$, $(n=2, \sigma(\beta)=0.5)$, $(n=5, \sigma(\beta)=0.0)$, $(n=5, \sigma(\beta)=0.5)$.
  • Figure 4: Average (across repeats) training losses. Rows (top to bottom): D3QN, MAPPO Actor, MAPPO Critic. Columns (left to right): $(n=2, \sigma(\beta)=0.0)$, $(n=2, \sigma(\beta)=0.5)$, $(n=5, \sigma(\beta)=0.0)$, $(n=5, \sigma(\beta)=0.5)$.
  • Figure 5: Average (across repeats) training policy values. Rows (top to bottom): D3QN Q-values, MAPPO policy values. Columns (left to right): $(n=2, \sigma(\beta)=0.0)$, $(n=2, \sigma(\beta)=0.5)$, $(n=5, \sigma(\beta)=0.0)$, $(n=5, \sigma(\beta)=0.5)$.
  • ...and 1 more figure

Theorems & Definitions (10)

  • Remark 1
  • Example 1: Hard minimum price rule
  • Example 2: Incentive factor
  • Example 3: Homogeneous special case
  • Proposition 1
  • proof
  • Corollary 1
  • proof
  • Definition 1
  • Definition 2