By Fair Means or Foul: Quantifying Collusion in a Market Simulation with Deep Reinforcement Learning
Michael Schlechtinger, Damaris Kosack, Franz Krause, Heiko Paulheim
TL;DR
This paper investigates algorithmic collusion in a pricing game by deploying multi-agent deep reinforcement learning (DRL) within a scalable oligopoly simulation. It introduces a novel demand framework that blends multiple demand models via a bias parameter $\mu \in [0,1]$ and analyzes two scenarios, including a constrained-observation setting, to study convergence to supracompetitive prices. Using model-free DRL algorithms (PPO and DQN) in a discrete action space, the study shows that agents can converge to collusive-like pricing with oscillatory dynamics, even without direct inter-agent communication, and that these outcomes persist under ablations. The work discusses legal implications under EU and German competition law and argues for policy and regulatory considerations to mitigate algorithmic collusion while highlighting the broader impact on e-commerce pricing strategies.
Abstract
In the rapidly evolving landscape of e-commerce, Artificial Intelligence (AI) based pricing algorithms, particularly those utilizing Reinforcement Learning (RL), are becoming increasingly prevalent. This rise has created a complex pricing environment with the potential for market collusion. Our research employs an experimental oligopoly model of repeated price competition, systematically varying the environment to cover scenarios from basic economic theory to subjective consumer demand preferences. We also introduce a novel demand framework that enables the implementation of various demand models and allows a weighted blending of them. In contrast to existing research in this domain, we aim to investigate the strategies and emerging pricing patterns developed by the agents, which may lead to a collusive outcome. Furthermore, we investigate a scenario where agents cannot observe their competitors' prices. Finally, we provide a comprehensive legal analysis across all scenarios. Our findings indicate that RL-based AI agents converge to a collusive state characterized by the charging of supracompetitive prices, without necessarily requiring inter-agent communication. Implementing alternative RL algorithms, altering the number of agents or simulation settings, and restricting the scope of the agents' observation space do not significantly affect the collusive market outcome.
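To make the idea of a weighted blending of demand models concrete, the following is a minimal sketch and not the paper's implementation. It assumes two hypothetical demand models (a winner-takes-all rule and a logit-style rule, both chosen here only for illustration) and a hypothetical convention in which $\mu = 0$ selects the first model and $\mu = 1$ the second. The function names, the `sensitivity` parameter, and the direction of $\mu$ are all assumptions.

```python
import math
from typing import List, Sequence


def winner_takes_all(prices: Sequence[float]) -> List[float]:
    """Hypothetical model A: the cheapest sellers split the whole market equally."""
    lowest = min(prices)
    winners = [p == lowest for p in prices]
    share = 1.0 / sum(winners)
    return [share if w else 0.0 for w in winners]


def logit_shares(prices: Sequence[float], sensitivity: float = 1.0) -> List[float]:
    """Hypothetical model B: smooth, logit-style shares that decrease with price."""
    weights = [math.exp(-sensitivity * p) for p in prices]
    total = sum(weights)
    return [w / total for w in weights]


def blended_shares(prices: Sequence[float], mu: float) -> List[float]:
    """Convex combination of the two models, weighted by mu in [0, 1].

    Assumed convention (not taken from the paper): mu = 0 gives model A only,
    mu = 1 gives model B only.
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    a = winner_takes_all(prices)
    b = logit_shares(prices)
    return [(1.0 - mu) * x + mu * y for x, y in zip(a, b)]


if __name__ == "__main__":
    prices = [1.0, 1.5, 2.0]
    for mu in (0.0, 0.5, 1.0):
        print(mu, blended_shares(prices, mu))
```

Because each model returns market shares that sum to one, any convex combination of them also sums to one, so $\mu$ interpolates between the two demand regimes without changing total demand in this sketch.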
