Algorithmic Collusion is Algorithm Orchestration

Cesare Carissimo, Fryderyk Falniowski, Siavash Rahimi, Heinrich Nax

TL;DR

This paper reframes algorithmic collusion as a meta-game in which firms design the learning algorithms that set prices in markets. It argues that collusive pricing requires either orchestration (co-training) or coordinated parameterization (co-parametrization) rather than isolated learning. Through a computational study of two Q-learning agents in a Bertrand duopoly, the authors identify meta Nash equilibria (meta-NEs) near competitive prices and a Pareto front of asymmetric parameterizations that yield higher profits, which has regulatory implications for distinguishing algorithm competition from orchestration. The work provides practical tests to detect meta-game collusion and discusses limitations and directions for future research, including demand uncertainty and online learning dynamics.

Abstract

We propose a fresh 'meta-game' perspective on the problem of algorithmic collusion in pricing games à la Bertrand. Economists have interpreted the fact that algorithms can learn to price collusively as tacit collusion. We argue instead that the co-parametrization of algorithms, in the ways necessary to obtain algorithmic collusion, typically requires algorithm designers to engage in some form of explicit collusion or 'algorithm orchestration.' In our model, the algorithm designers play a meta-game of parametrizing their algorithms, which then play repeated Bertrand competition. The strategic analysis at the meta-level reveals new equilibrium and collusion phenomena. (JEL: C62, C63, D43, L13)
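
To make the meta-game idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each designer picks one parameterization from a small finite grid and that the long-run profit of every pair has already been measured and stored in two tables. The function names and the toy payoff numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def best_responses(payoff: Sequence[Sequence[float]], opponent: int) -> List[int]:
    """Return the own-parameterization indices that maximize payoff against `opponent`.

    payoff[own][opponent] is the payoff of the player whose index comes first.
    """
    column = [row[opponent] for row in payoff]
    best = max(column)
    return [i for i, value in enumerate(column) if value == best]


def pure_meta_nash(
    payoff1: Sequence[Sequence[float]],
    payoff2: Sequence[Sequence[float]],
) -> List[Tuple[int, int]]:
    """Return all pure-strategy equilibria (i, j) of the finite meta-game.

    payoff1[i][j] and payoff2[i][j] are the payoffs of designers 1 and 2
    when designer 1 picks parameterization i and designer 2 picks j.
    """
    n1, n2 = len(payoff1), len(payoff1[0])
    # Re-index designer 2's table so that its own choice comes first.
    payoff2_own_first = [[payoff2[i][j] for i in range(n1)] for j in range(n2)]
    equilibria = []
    for i in range(n1):
        for j in range(n2):
            if i in best_responses(payoff1, j) and j in best_responses(payoff2_own_first, i):
                equilibria.append((i, j))
    return equilibria


# Hypothetical toy payoffs (assumption, not data from the paper):
# index 0 = a "competitive" parameterization, index 1 = a "cooperative" one.
toy_payoff1 = [[1.0, 3.0], [0.0, 2.0]]
toy_payoff2 = [[1.0, 0.0], [3.0, 2.0]]
toy_equilibria = pure_meta_nash(toy_payoff1, toy_payoff2)
```

In this toy table the only mutual best response is the competitive pair, while the cooperative pair gives both designers more. Reaching it requires the designers to coordinate their choices, which is the distinction the abstract draws between algorithm competition and orchestration.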

Paper Structure

This paper contains 21 sections, 5 equations, 5 figures, 1 algorithm.

Figures (5)

  • Figure 1: Best responses and relevant ratios. $\Theta$ is the parameter space, br is the best response function, arrows represent best responses, and percentages give the share of $\Theta$ that has a best response in the same region of $\Theta$. Only about $12\%$ of parameters are best responses; the symmetric meta-NEs are best responses to $30\%$ of the parameter space, and the asymmetric meta-NEs to only $0.07\%$. This gives an $80:1$ ratio: about $80\%$ of the best responses are contained in about $1\%$ of the parameter space. These 'top $1\%$' parameters have low exploration rates ($\epsilon$) and low discount factors ($\gamma$).
  • Figure 2: Best Responses in the Meta-Game: this plot encodes best responses as vectors, where the tail starts at the parameters of one player, and the head points to the best response parameters for the other player. The 2D face plots are best responses for payoffs averaged over the third parameter. Similarly, the 1D plots, which align with the cube axes, average the payoffs over the two missing parameters. The figure is explained in greater detail in \ref{sec:figure_explanation}.
  • Figure 3: Profit gain comparison between algorithmic design setups. All of our experiments are plotted as tiny gray dots. The meta-NEs are stars (large blue for symmetric, small red for asymmetric), the co-parametrized Pareto front is shown as orange crosses, and co-training collusion is shown as red dots enclosed by a box. The co-training collusion values are the range of profit gains achieved in Calvano et al. (2020).
  • Figure 4: The Pareto front parameters where both players are profitable ($\Delta \geq 1$). The orange crosses connected by lines represent the Pareto optimal combinations, and the meta-NEs are represented as stars. We can identify two kinds of Pareto optimal combinations: a) where one of the parameters is very close to the meta-NEs, and b) where both parameters are far from the meta-NEs.
  • Figure 5: Comparison of fixed exploration and decayed exploration: Player 1 has parameters $\alpha=0.12, \epsilon=0.27, \gamma=0.22$, while Player 2 has parameters $\alpha=0.01, \gamma=0.99$, and $\epsilon$ is decayed from 1 to 0 with decay parameter $\beta=0.4 \times 10^{-5}$.
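
The two exploration schemes compared in Figure 5 can be sketched as follows. The caption gives only the decay parameter $\beta$ and the endpoints, so the exponential form $\epsilon_t = e^{-\beta t}$ used here is an assumption (it is the schedule commonly used in the Q-learning pricing literature), and the function names are hypothetical.

```python
import math


def fixed_epsilon(t: int, epsilon: float = 0.27) -> float:
    """Player 1 in Figure 5: constant exploration rate, independent of the step t."""
    return epsilon


def decayed_epsilon(t: int, beta: float = 0.4e-5) -> float:
    """Player 2 in Figure 5: exploration decays from 1 towards 0.

    Assumes the exponential schedule epsilon_t = exp(-beta * t); the caption
    does not state the functional form.
    """
    return math.exp(-beta * t)


def steps_until(threshold: float, beta: float = 0.4e-5) -> int:
    """Smallest step t at which the assumed schedule is at or below `threshold`."""
    return math.ceil(-math.log(threshold) / beta)


# Step at which the decaying player explores no more than the fixed player.
crossover_step = steps_until(0.27)
```

Under this assumed schedule the decaying player explores more than the fixed player early on and less after `crossover_step`, which is a few hundred thousand steps for the $\beta$ in the caption.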