Learning Collusion in Episodic, Inventory-Constrained Markets
Paul Friedrich, Barna Pásztor, Giorgia Ramponi
TL;DR
This work addresses tacit collusion in pricing by modeling episodic, inventory-constrained markets (e.g., airline revenue management) as finite-horizon Markov games. It introduces a formal framework, a collusion measure based on Nash and monopolistic reference prices, and a numerical method to compute these reference prices when closed-form solutions do not exist. Through experiments with PPO and DQN agents, the study shows that learned strategies can converge to collusive pricing under realistic constraints, with PPO achieving higher levels of collusion than DQN. The results demonstrate the potential for RL-driven pricing to exhibit collusion in practical markets and motivate development of regulatory and mitigation strategies.
Abstract
Pricing algorithms have demonstrated the capability to learn tacit collusion that is largely unaddressed by current regulations. Their increasing use in markets, including oligopolistic industries with a history of collusion, calls for closer examination by competition authorities. In this paper, we extend the study of tacit collusion in learning algorithms from basic pricing games to more complex markets characterized by perishable goods with fixed supply and sell-by dates, such as airline tickets, perishables, and hotel rooms. We formalize collusion within this framework and introduce a metric based on price levels under both the competitive (Nash) equilibrium and collusive (monopolistic) optimum. Since no analytical expressions for these price levels exist, we propose an efficient computational approach to derive them. Through experiments, we demonstrate that deep reinforcement learning agents can learn to collude in this more complex domain. Additionally, we analyze the underlying mechanisms and structures of the collusive strategies these agents adopt.
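To make the idea of a price-based collusion metric concrete, the following is a minimal sketch and not the paper's exact definition. It assumes a normalization common in the algorithmic-collusion literature, where an observed price is placed on a scale from 0 (the competitive Nash price) to 1 (the monopolistic price), and it assumes the episode-level value is a plain average over time steps. The function names `collusion_index` and `episode_collusion_index` and all numbers are hypothetical.

```python
from typing import Sequence


def collusion_index(observed_price: float, nash_price: float, monopoly_price: float) -> float:
    """Normalized index (an assumed form): 0 at the Nash price, 1 at the monopoly price."""
    gap = monopoly_price - nash_price
    if gap <= 0:
        raise ValueError("monopoly_price must exceed nash_price")
    return (observed_price - nash_price) / gap


def episode_collusion_index(
    prices: Sequence[float],
    nash_prices: Sequence[float],
    monopoly_prices: Sequence[float],
) -> float:
    """Average the per-step index over one finite-horizon episode (an assumed aggregation)."""
    if not (len(prices) == len(nash_prices) == len(monopoly_prices)):
        raise ValueError("all price sequences must have the same length")
    if len(prices) == 0:
        raise ValueError("episode must contain at least one time step")
    per_step = [
        collusion_index(p, n, m)
        for p, n, m in zip(prices, nash_prices, monopoly_prices)
    ]
    return sum(per_step) / len(per_step)


# Hypothetical two-step episode: competitive in step 1, fully collusive in step 2.
example = episode_collusion_index([1.0, 2.0], [1.0, 1.0], [2.0, 2.0])
```

Because the reference prices in an episodic market can change from step to step as inventory depletes, the sketch takes them as per-step sequences instead of single constants. How those reference prices are computed is the subject of the paper's numerical method and is not reproduced here.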
